JP4239557B2 - Image processing apparatus and method, recording medium, and program

Info

Publication number: JP4239557B2
Application number: JP2002327271A
Authority: JP
Inventors: 哲二郎近藤; 靖立平; 淳一石橋; 成司和田; 泰広周藤
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2002-11-11
Filing date: 2002-11-11
Publication date: 2009-03-18
Anticipated expiration: 2022-11-11
Also published as: JP2004165839A

Description

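[0001]
[Technical Field of the Invention]
The present invention relates to an image processing apparatus and method, a recording medium, and a program, and in particular to an image processing apparatus and method, a recording medium, and a program suitable for use when, for example, the pixels of a pair of temporally consecutive images are matched with each other and a per-pixel motion vector is detected on the basis of the matching result.
[0002]
[Prior Art]
For example, processing that compresses and encodes an image signal, such as the MPEG2 (Moving Picture Experts Group) scheme, uses encoding based on the correlation between two adjacent frames, so-called motion-compensated inter-frame prediction. Motion-compensated inter-frame prediction requires detecting a motion vector, for each pixel or for each pixel block of a predetermined size, between two adjacent frames (one is referred to as the target frame and the other as the reference frame).
[0003]
Conventionally, a method called the block matching algorithm has been used to detect motion vectors (see, for example, Patent Document 1).
[0004]
FIG. 1 shows a configuration example of a motion vector detection device that detects motion vectors according to the block matching algorithm. This motion vector detection device 1 comprises frame memories 2 and 3, each of which holds one frame of the input image signal and outputs it to the subsequent stage, and a detection unit 4 that, on the basis of the image signals input one frame at a time from frame memories 2 and 3, detects motion vectors in the image of the image signal input from frame memory 2.
[0005]
Frame memory 2 holds one frame of the input image signal and, when the image signal of the next frame is input, outputs the held image signal to frame memory 3 and the detection unit 4. Frame memory 3 holds one frame of the image signal input from frame memory 2 and, when the image signal of the next frame is input, outputs the held image signal to the detection unit 4.
[0006]
The detection unit 4 therefore receives the image signals of two consecutive frames. Hereinafter, the image of the one-frame image signal input from frame memory 2 to the detection unit 4 is referred to as the target frame Fc, and the image of the image signal input from frame memory 3 to the detection unit 4, which is one frame earlier than the target frame Fc, is referred to as the reference frame Fr.
[0007]
The detection unit 4 calculates motion vectors in the target frame Fc according to the block matching algorithm. The block matching algorithm is described with reference to FIG. 2, which shows the correspondence between the target frame Fc and the reference frame Fr, and the flowchart of FIG. 3, which shows the order of processing.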
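[0008]
In the block matching algorithm, every pixel in the target frame Fc is designated in turn as the target pixel. For each target pixel, the sum of absolute differences between the pixel values of corresponding pixel pairs is computed according to equation (1) between a base block of a predetermined size (L × L pixels) centered on the target pixel and a reference block (of the same size as the base block) that is moved within a search area SR provided in the reference frame Fr.
$$\sum_{i=1}^{L}\sum_{j=1}^{L}\left|F_c(i,j)-F_{rn}(i,j)\right| \qquad (1)$$
[0009]
Here, Fc(i, j) is the pixel value of a pixel of the base block, and Frn(i, j) is the pixel value of a pixel of the reference block with identification number n. The sum over i is taken as i is incremented by 1 from 1 to L, and the sum over j is taken as j is incremented by 1 from 1 to L.
[0010]
The difference vector between the center coordinates of the reference block for which the sum of absolute differences between the base block and the reference block is smallest and the coordinates of the target pixel is then calculated as the motion vector.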
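[0011]
Specifically, the following processing is executed for the target pixel of the target frame Fc. In step S1, the detection unit 4 sets, in the reference frame Fr, a search area SR that is larger than the base block and centered on the same coordinates as the target pixel of the target frame Fc.
[0012]
In step S2, the detection unit 4 initializes a variable min, which stores the minimum sum of absolute differences, to its maximum value. For example, when each pixel value has 8 bits and the base block is 4 × 4 pixels, the variable min is initialized to 4096 (= 2⁸ × 16).
[0013]
In step S3, the detection unit 4 initializes the identification number n of the reference block moved within the search area SR to 1. In step S4, the detection unit 4 initializes a variable sum, which stores the computed sum of absolute differences, to 0.
[0014]
In step S5, the detection unit 4 computes the sum of absolute differences between the pixel values of correspondingly positioned pixel pairs in the base block of the target frame Fc and the reference block with identification number n in the search area SR set in the reference frame Fr, and assigns the result to the variable sum. In step S6, the detection unit 4 compares the variable sum computed in step S5 with the variable min and determines whether sum is smaller than min. If sum is determined to be smaller than min, the processing proceeds to step S7.
[0015]
In step S7, the detection unit 4 replaces the variable min with the variable sum. The detection unit 4 also stores the identification number n of the current reference block as the motion vector number.
[0016]
In step S8, the detection unit 4 determines whether the identification number n of the reference block has reached its maximum value, that is, whether the reference block has been moved over the entire search area SR. If n is determined not to be the maximum value, the processing proceeds to step S9. In step S9, the detection unit 4 increments the identification number n by 1, returns to step S4, and repeats the subsequent processing.
[0017]
If it is determined in step S6 that the variable sum is not smaller than the variable min, step S7 is skipped.
[0018]
Thereafter, when it is determined in step S8 that the identification number n of the reference block is the maximum value, that is, when the reference block has been moved over the entire search area SR, the processing proceeds to step S10.
[0019]
In step S10, the detection unit 4 calculates, as the motion vector of the target pixel of the target frame Fc, the difference vector between the center coordinates of the reference block corresponding to the identification number n stored as the motion vector number and the coordinates of the target pixel of the target frame Fc. This concludes the description of the block matching algorithm.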
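Steps S1 to S10 can be sketched in Python as a full search over a square search area. This is only an illustrative sketch under stated assumptions: the function name `block_matching`, the odd block size (so that the block has a well-defined center pixel), the `search` radius, and the list-of-rows frame layout are not taken from the patent, and candidate positions whose block would fall outside the frame are simply skipped.

```python
def block_matching(fc, fr, x, y, block=3, search=2):
    """Full-search block matching for the target pixel (x, y) of frame fc.

    fc, fr: frames as lists of rows of 8-bit pixel values (assumed layout).
    block:  odd block size L, an assumption so the block has a center pixel.
    search: search radius in pixels around (x, y) in fr (assumed parameter).
    Returns ((dx, dy), best_sum), where (dx, dy) is the motion vector.
    The target pixel is assumed to lie at least block // 2 pixels from the border.
    """
    h = block // 2
    height, width = len(fr), len(fr[0])
    best_sum = (2 ** 8) * block * block      # S2: upper bound on any possible sum
    best = (0, 0)
    for dy in range(-search, search + 1):    # S3, S8, S9: visit every reference block
        for dx in range(-search, search + 1):
            cx, cy = x + dx, y + dy
            if not (h <= cx < width - h and h <= cy < height - h):
                continue                     # block would leave the frame
            total = 0                        # S4
            for j in range(-h, h + 1):       # S5: sum of absolute differences
                for i in range(-h, h + 1):
                    total += abs(fc[y + j][x + i] - fr[cy + j][cx + i])
            if total < best_sum:             # S6
                best_sum, best = total, (dx, dy)   # S7
    return best, best_sum                    # S10: offset of the best block center
```

Every candidate costs L × L subtractions, which is the computational burden that the description goes on to criticize.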
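[0020]
[Patent Document 1]
Japanese Patent No. 3277417
[0021]
[Problems to Be Solved by the Invention]
The block matching algorithm described above has the problem that the amount of computation needed for the sum of absolute differences of the pixel values of the pixel pairs in step S5 is extremely large, so that most of the time required for image compression is spent on this computation.
[0022]
The present invention has been made in view of this situation, and its object is to enable matching between images with only a small amount of computation while also enabling accurate detection of motion vectors and the like.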
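[0023]
[Means for Solving the Problems]
The image processing apparatus of the present invention comprises: setting means for setting a block of a predetermined size consisting of a target pixel in an input image and its neighboring pixels; calculation means for calculating a threshold by adding, to the minimum of the pixel values of the plurality of pixels included in the block set by the setting means, one half of the dynamic range given by the difference between the maximum and the minimum of those pixel values; holding means for holding a correspondence table that associates class codes with numbers of bits, each number of bits being determined so that the cumulative frequency of bit inversions in the class code reaches a predetermined value, the bit inversions being obtained by setting blocks of the predetermined size on a plurality of sample images, generating the class codes corresponding to those blocks, and comparing them with the class codes for the image of the preceding frame; acquisition means for acquiring the number of bits corresponding to the generated class code by referring to the correspondence table held by the holding means; and generation means for generating a class code for the target pixel by encoding each of the pixel values of the plurality of pixels included in the block set by the setting means as 1 when it is larger than the threshold calculated by the calculation means and as 0 when it is smaller than the threshold, and for further encoding in both ways, as 0 and as 1, the pixel values of as many pixels as the number of bits acquired by the acquisition means, taken in order of closeness to the threshold calculated by the calculation means from among the pixel values of the plurality of pixels included in the block corresponding to the generated class code.
[0027]
The image processing method of the present invention includes: a setting step of setting a block of a predetermined size consisting of a target pixel in an input image and its neighboring pixels; a calculation step of calculating a threshold by adding, to the minimum of the pixel values of the plurality of pixels included in the block set in the setting step, one half of the dynamic range given by the difference between the maximum and the minimum of those pixel values; an acquisition step of acquiring the number of bits corresponding to the generated class code by referring to a correspondence table that associates class codes with numbers of bits, each number of bits being determined so that the cumulative frequency of bit inversions in the class code reaches a predetermined value, the bit inversions being obtained by setting blocks of the predetermined size on a plurality of sample images, generating the class codes corresponding to those blocks, and comparing them with the class codes for the image of the preceding frame; and a generation step of generating a class code for the target pixel by encoding each of the pixel values of the plurality of pixels included in the block set in the setting step as 1 when it is larger than the threshold calculated in the calculation step and as 0 when it is smaller than the threshold, and of further encoding in both ways, as 0 and as 1, the pixel values of as many pixels as the number of bits acquired in the acquisition step, taken in order of closeness to the threshold calculated in the calculation step from among the pixel values of the plurality of pixels included in the block corresponding to the generated class code.
[0028]
The program on the recording medium of the present invention causes a computer to execute processing including: a setting step of setting a block of a predetermined size consisting of a target pixel in an input image and its neighboring pixels; a calculation step of calculating a threshold by adding, to the minimum of the pixel values of the plurality of pixels included in the block set in the setting step, one half of the dynamic range given by the difference between the maximum and the minimum of those pixel values; an acquisition step of acquiring the number of bits corresponding to the generated class code by referring to a correspondence table that associates class codes with numbers of bits, each number of bits being determined so that the cumulative frequency of bit inversions in the class code reaches a predetermined value, the bit inversions being obtained by setting blocks of the predetermined size on a plurality of sample images, generating the class codes corresponding to those blocks, and comparing them with the class codes for the image of the preceding frame; and a generation step of generating a class code for the target pixel by encoding each of the pixel values of the plurality of pixels included in the block set in the setting step as 1 when it is larger than the threshold calculated in the calculation step and as 0 when it is smaller than the threshold, and of further encoding in both ways, as 0 and as 1, the pixel values of as many pixels as the number of bits acquired in the acquisition step, taken in order of closeness to the threshold calculated in the calculation step from among the pixel values of the plurality of pixels included in the block corresponding to the generated class code.
[0029]
The program of the present invention causes a computer to execute processing including: a setting step of setting a block of a predetermined size consisting of a target pixel in an input image and its neighboring pixels; a calculation step of calculating a threshold by adding, to the minimum of the pixel values of the plurality of pixels included in the block set in the setting step, one half of the dynamic range given by the difference between the maximum and the minimum of those pixel values; an acquisition step of acquiring the number of bits corresponding to the generated class code by referring to a correspondence table that associates class codes with numbers of bits, each number of bits being determined so that the cumulative frequency of bit inversions in the class code reaches a predetermined value, the bit inversions being obtained by setting blocks of the predetermined size on a plurality of sample images, generating the class codes corresponding to those blocks, and comparing them with the class codes for the image of the preceding frame; and a generation step of generating a class code for the target pixel by encoding each of the pixel values of the plurality of pixels included in the block set in the setting step as 1 when it is larger than the threshold calculated in the calculation step and as 0 when it is smaller than the threshold, and of further encoding in both ways, as 0 and as 1, the pixel values of as many pixels as the number of bits acquired in the acquisition step, taken in order of closeness to the threshold calculated in the calculation step from among the pixel values of the plurality of pixels included in the block corresponding to the generated class code.
[0030]
In the image processing apparatus and method, recording medium, and program of the present invention, each of the pixel values of the plurality of pixels included in a block of a predetermined size containing the target pixel is encoded as 1 when it is larger than the calculated threshold and as 0 when it is smaller than the threshold, whereby a class code for the target pixel is generated. Further, from among the pixel values of the plurality of pixels included in the block corresponding to the generated class code, the pixel values of as many pixels as the acquired number of bits, taken in order of closeness to the calculated threshold, are encoded in both ways, as 0 and as 1.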
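[0031]
[Embodiments of the Invention]
A configuration example of a motion vector detection device to which the present invention is applied is described with reference to FIG. 4. This motion vector detection device 11 comprises frame memories 12 and 13, a class code generation unit 14, a table memory 15, an ME memory 16, and a motion vector calculation unit 17.
[0032]
Frame memory 12 holds one frame of the input image signal and, when the image signal of the next frame is input, outputs the held image signal to frame memory 13 and the class code generation unit 14. Frame memory 13 holds one frame of the input image signal and, when the image signal of the next frame is input from frame memory 12, outputs the held image signal to the class code generation unit 14.
[0033]
The class code generation unit 14 therefore receives the image signals of two temporally consecutive frames. Hereinafter, the image of the one-frame image signal input from frame memory 12 to the class code generation unit 14 is referred to as the target frame Fc, and the image of the image signal input from frame memory 13 to the class code generation unit 14, which is one frame earlier than the target frame Fc, is referred to as the reference frame Fr.
[0034]
For every pixel of the target frame Fc, the class code generation unit 14 generates exactly one class code, which represents a spatial feature derived from the pixel values of neighboring pixels, and outputs it to the motion vector calculation unit 17. Specifically, every pixel of the target frame Fc is set in turn as the target pixel, a class code tap of a predetermined size centered on the target pixel (for example, 3 × 3 pixels as shown in FIG. 5) is determined, and the pixel values of the plurality of pixels included in the class code tap (9 pixels in the case of FIG. 5) are each encoded as 0 or 1 by 1-bit ADRC (Adaptive Dynamic Range Coding) to generate a class code of a predetermined number of bits (9 bits in the case of FIG. 5).
[0035]
The class code generation unit 14 also generates exactly one class code, representing a spatial feature derived from the pixel values of neighboring pixels, for every pixel of the reference frame Fr. However, because the reference frame Fr was the target frame Fc at the previous timing, the single class code for each pixel of the reference frame Fr has already been generated at that timing. The class codes generated for the pixels of the target frame Fc at the previous timing may therefore be retained and reused as the class codes for the pixels of the reference frame Fr at the current timing.
[0036]
Furthermore, on the basis of a first or second table held in advance in the table memory 15, the class code generation unit 14 takes predetermined bits among the bits constituting the single generated class code, considers both the case where each such bit is 0 and the case where it is 1, thereby generates one or more class codes, and outputs them to the ME memory 16. The processing that generates class codes for the pixels of the reference frame Fr is described later.
[0037]
The table memory 15 stores a first table, which gives the correspondence between each class code and the number of bits to be encoded in both ways as 0 and 1, and a second table, which gives, for each class code, the frequency of bit inversion at each pixel position in the class code tap. The first and second tables are assumed to have been generated statistically in advance from a plurality of sample image signals and stored in the table memory 15.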
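[0038]
The processing that generates the first table is described with reference to FIGS. 6 to 8. First, a class code tap is set for each pixel of a plurality of sample image signals, and a class code is generated by ADRC. Next, each generated class code is compared with the pixel values in the class code tap from which it was generated, the bits of the class code in which bit inversion occurs are identified, and the number of such bits is counted. On the basis of the counts, a histogram such as that shown in FIG. 6 is generated, with the number of bits on the horizontal axis and the normalized frequency of bit inversion on the vertical axis.
[0039]
FIG. 6 shows polylines a to d representing the counts for four class codes. In practice, when the class code has 9 bits, for example, 512 polylines are obtained, one for each of the 512 (= 2⁹) class codes. For each class code, the number of bits K is then determined so that the cumulative frequency of bit inversion reaches a predetermined value, and K is written into the first table. The same predetermined value is used for all class codes.
[0040]
For example, when the predetermined value is set to 40%, K = 1 is determined for the class code corresponding to polyline d, as shown in FIG. 7. When the predetermined value is set to 80%, K = 3 is determined for the class code corresponding to polyline d, as shown in FIG. 8.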
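The choice of K from a normalized histogram can be sketched as follows. The text does not spell out how the cumulative frequency is accumulated, so this sketch assumes that `hist[k]` is the normalized frequency with which exactly k bits are inverted (starting at k = 0) and that K is the smallest k at which the running total reaches the predetermined value. The function name `choose_k` and the histogram values are hypothetical; the values were chosen only so that they reproduce K = 1 at 40% and K = 3 at 80%, as stated for polyline d.

```python
def choose_k(hist, target):
    """Return the smallest K whose cumulative frequency reaches `target`.

    hist[k] is assumed to be the normalized frequency with which exactly
    k bits of the class code are inverted (an assumption of this sketch).
    """
    cumulative = 0.0
    for k, freq in enumerate(hist):
        cumulative += freq
        if cumulative >= target:
            return k
    return len(hist) - 1  # target never reached: fall back to the largest k


# Hypothetical histogram for one 9-bit class code (k = 0 .. 9).
hist_d = [0.25, 0.20, 0.20, 0.20, 0.10, 0.05, 0.0, 0.0, 0.0, 0.0]
k_at_40_percent = choose_k(hist_d, 0.40)
k_at_80_percent = choose_k(hist_d, 0.80)
```

Running `choose_k` once per class code and storing the results keyed by class code would give a table with the shape of the first table.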
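[0041]
FIG. 9 shows an example of the first table for the 9-bit class codes generated with the class code tap of FIG. 5. In this example, class code 100011111 is associated with K = 3, and class code 101001111 is associated with K = 2.
[0042]
Next, the processing that generates the second table is described with reference to FIG. 10. First, a class code tap is set for each pixel of a plurality of sample image signals, and a class code is generated by ADRC. Next, each generated class code is compared with the pixel values in the class code tap from which it was generated, the bits of the class code in which bit inversion occurs are identified, and the position in the class code tap of the pixel corresponding to each such bit is determined. On the basis of the results, a histogram such as that shown in FIG. 10 is generated, with the pixel position in the class code tap on the horizontal axis and the normalized frequency of bit inversion on the vertical axis, and the second table is generated from it.
[0043]
FIG. 10 shows the results for three class codes as polylines e to g. In practice, when the class code has 9 bits, for example, 512 polylines are obtained, one for each of the 512 (= 2⁹) class codes.
[0044]
FIG. 11 shows an example of the second table for the 9-bit class codes generated with the class code tap of FIG. 5. For each 9-bit class code, the second table stores the frequency of bit inversion at each of the nine pixel positions.
[0045]
For example, for class code 000000000, the frequency of bit inversion is 0 when encoding pixel value P1 (upper left of the target pixel), 0.05 for P2 (directly above), 0 for P3 (upper right), P4 (directly left), and P5 (the target pixel itself), 0.06 for P6 (directly right), and 0 for P7 (lower left), P8 (directly below), and P9 (lower right).
[0046]
Likewise, for class code 101000110, the frequency of bit inversion is 0.09 for P1 (upper left), 0.55 for P2 (directly above), 0.04 for P3 (upper right), 0.3 for P4 (directly left), 0.1 for P5 (the target pixel itself), 0.09 for P6 (directly right), 0.15 for P7 (lower left), and 0.08 for P8 (directly below) and P9 (lower right).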
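[0047]
Returning to FIG. 4, the ME memory 16 stores the one or more class codes for each pixel of the reference frame Fr that are input from the class code generation unit 14, with the coordinates of each pixel associated with its class codes.
[0048]
FIG. 12 shows the structure of the ME memory 16. The ME memory 16 consists of (a + 1) × (b + 1) cells addressed by feature addresses 0 to a and flag addresses 0 to b. Hereinafter, for example, the cell at feature address 1 and flag address 2 is written as cell (1, 2).
[0049]
The feature address corresponds to the class code generated by the class code generation unit 14 for each pixel of the reference frame Fr. For example, when the class code has 9 bits, there are 2⁹ feature addresses, so the maximum feature address is a = 2⁹ − 1.
[0050]
The cells at feature address 0 with flag address 1 and above store, in raster order, the coordinates of the pixels of the reference frame Fr for which the class code generation unit 14 generated class code 0. Cell (0, 0), at feature address 0 and flag address 0, stores the number of cells at feature address 0 with flag address 1 and above that already hold the coordinates of a pixel with class code 0, that is, the number of cells in use. For example, when class code 0 has been generated for three pixels of the reference frame Fr, the coordinates of the three pixels are stored in cells (0, 1), (0, 2), and (0, 3), and cell (0, 0) stores 3, the number of cells holding coordinates.
[0051]
The cells at feature address 1 with flag address 1 and above store, in raster order, the coordinates of the pixels of the reference frame Fr for which the class code generation unit 14 generated class code 1. Cell (1, 0), at feature address 1 and flag address 0, stores the number of cells at feature address 1 with flag address 1 and above that hold the coordinates of a pixel with class code 1. For example, when the same class code 1 has been generated for ten pixels of the reference frame Fr, the coordinates of the ten pixels are stored in cells (1, 1) through (1, 10), and cell (1, 0) stores 10, the number of cells holding coordinates. The cells at feature address 2 and above are handled in the same way, so their description is omitted.
[0052]
When, for example, three class codes have been generated for one pixel of the reference frame Fr, the coordinates of that pixel are stored in three cells, one at each of the feature addresses corresponding to the three class codes.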
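The cell layout of the ME memory can be sketched as a list of rows, one per feature address, where element 0 of each row plays the role of the flag-address-0 cell (the count) and the remaining elements hold coordinates. The function name `build_me_memory`, the input dictionary, and the use of growable Python lists in place of a fixed (a + 1) × (b + 1) array are assumptions made for illustration.

```python
def build_me_memory(codes_by_pixel, n_bits=9):
    """Build an ME-memory-like structure.

    codes_by_pixel: dict mapping (x, y) to an iterable of class-code strings
                    such as "000000101" (hypothetical input format).
    Returns a list `memory` where memory[a][0] is the number of stored
    coordinates for feature address a and memory[a][1:] are the coordinates.
    """
    memory = [[0] for _ in range(2 ** n_bits)]
    # Visit pixels in raster order: top to bottom, then left to right.
    for (x, y) in sorted(codes_by_pixel, key=lambda c: (c[1], c[0])):
        for code in codes_by_pixel[(x, y)]:
            row = memory[int(code, 2)]   # the class code is the feature address
            row.append((x, y))           # next free cell at flag address >= 1
            row[0] += 1                  # the flag-address-0 cell counts used cells
    return memory


example = build_me_memory({
    (0, 0): ["000000000"],
    (1, 0): ["000000000", "000000001"],   # a pixel with two class codes
    (0, 1): ["000000000"],
})
```

Here `example[0]` holds the count 3 followed by three coordinates, mirroring the three-pixel example for class code 0, and the pixel at (1, 0) appears under two feature addresses because two class codes were generated for it.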
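[0053]
Returning to FIG. 4, for each pixel of the target frame Fc, the motion vector calculation unit 17 searches the ME memory 16 for the coordinates of the pixels of the reference frame Fr for which the same class code was generated. Among the pixels found, it determines the pixel whose coordinates are closest to the coordinates of the target pixel to be the pixel of the reference frame Fr that corresponds to the target pixel of the target frame Fc, and calculates the motion vector of the target pixel.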
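The lookup performed by the motion vector calculation unit can be sketched as below. The memory layout (a list of rows with the count in element 0) is an assumption of this sketch, as are the function name `motion_vector`, the use of squared Euclidean distance for "closest", and returning `None` when no reference pixel shares the class code; the text does not specify the distance measure or the no-match behavior.

```python
def motion_vector(me_memory, class_code, x, y):
    """Find the nearest reference pixel with the same class code.

    me_memory[a] is assumed to be [count, (x1, y1), (x2, y2), ...].
    Returns (dx, dy) from the target pixel (x, y) to the matched pixel,
    or None if no reference pixel has this class code.
    """
    row = me_memory[int(class_code, 2)]
    count, coords = row[0], row[1:]
    if count == 0:
        return None
    # Squared Euclidean distance is an assumed notion of "closest".
    rx, ry = min(coords, key=lambda c: (c[0] - x) ** 2 + (c[1] - y) ** 2)
    return (rx - x, ry - y)


memory = [[0] for _ in range(512)]
memory[0b000000101] = [2, (10, 10), (4, 3)]
vector = motion_vector(memory, "000000101", 3, 3)
```

No pixel differences are summed here: the cost per target pixel is one table lookup plus a distance comparison over the stored candidates, which is where the reduction in computation relative to block matching comes from.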
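[0055]
Next, the class code generation processing performed by the class code generation unit 14 using the first table is described with reference to the flowchart of FIG. 13. In this processing, each pixel of the reference frame Fr is designated in turn as the target pixel, and the processing is executed for that target pixel.
[0056]
In step S11, the class code generation unit 14 sets a class code tap of a predetermined size centered on the target pixel and acquires the pixel values of the plurality of pixels included in the class code tap. In the following, as shown in FIG. 5, the class code tap is taken to be 3 × 3 pixels, and the pixel values of the pixels at the upper left, directly above, upper right, directly left, the target pixel itself, directly right, lower left, directly below, and lower right of the target pixel are denoted P1 to P9, respectively.
[0057]
In step S12, the class code generation unit 14 determines the maximum value P_MAX and the minimum value P_MIN of the pixel values P1 to P9. In step S13, the class code generation unit 14 calculates the dynamic range DR (= |P_MAX − P_MIN|) of the pixel values P1 to P9. In step S14, the class code generation unit 14 calculates a threshold Th by adding DR/2 to the minimum value P_MIN of the pixel values P1 to P9, as in equation (2).
Th = P_MIN + DR/2 ... (2)
[0058]
In step S15, the class code generation unit 14 compares each of the pixel values P1 to P9 with the threshold Th, encodes it as 1 when it is larger than Th and as 0 when it is smaller than Th, arranges the results in the order of the pixel positions, and thereby generates a 9-bit class code.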
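Steps S12 to S15 can be sketched in a few lines. The function name `adrc_class_code` and the pixel values in the example are hypothetical. The text only covers pixel values larger or smaller than the threshold, so encoding a value exactly equal to Th as 0 is an assumption of this sketch.

```python
def adrc_class_code(tap):
    """1-bit ADRC over a class code tap given as [P1, ..., P9].

    Returns (class_code, threshold). A pixel equal to the threshold is
    encoded as 0 here, which is an assumption of this sketch.
    """
    p_min, p_max = min(tap), max(tap)                         # S12
    dr = p_max - p_min                                        # S13: dynamic range
    th = p_min + dr / 2                                       # S14: equation (2)
    code = "".join("1" if p > th else "0" for p in tap)       # S15
    return code, th


# Hypothetical 3x3 tap listed in the order P1 .. P9.
tap = [210, 10, 100, 20, 180, 115, 190, 120, 170]
code, th = adrc_class_code(tap)
```

Because the threshold is derived from the tap's own minimum and dynamic range, adding a constant offset to all nine pixels leaves the class code unchanged.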
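[0059]
The processing of steps S11 to S15 has already been executed, as the processing that generates a class code for each pixel of the target frame Fc, at the previous timing when the current reference frame Fr was the target frame Fc, so its result may be reused.
[0060]
In step S16, the class code generation unit 14 refers to the first table stored in the table memory 15 and acquires the number of bits K corresponding to the class code generated in step S15. When the acquired number of bits is K = 0, the processing of step S17 described below is omitted.
[0061]
In step S17, the class code generation unit 14 treats each of the K pixel values among P1 to P9 that are closest to the threshold Th in both ways, as encoded to 0 and as encoded to 1, encodes the other pixel values as 0 or 1 by comparison with the threshold Th as in step S15, and thereby generates 2^K class codes for the target pixel. The result of step S15 may be reused for the encoding of the other pixel values.
[0062]
For example, when the pixel values P1 to P9 of the nine pixels included in the class code tap are as shown in FIG. 14, the 9-bit class code 100011111 is generated in step S15. In step S16, the number of bits K = 3 corresponding to the 9-bit class code 100011111 is acquired from the first table. In step S17, each of the three pixel values closest to the threshold Th, namely P3, P6, and P8, is treated in both ways, as encoded to 0 and as encoded to 1, and 8 (= 2³) 9-bit class codes are generated: 100010101, 100010111, 100011101, 100011111, 101010101, 101010111, 101011101, and 101011111.
[0063]
As another example, when the pixel values P1 to P9 of the nine pixels included in the class code tap are as shown in FIG. 15, the 9-bit class code 101001111 is generated in step S15. In step S16, the number of bits K = 2 corresponding to the 9-bit class code 101001111 is acquired from the first table. In step S17, each of the two pixel values closest to the threshold Th, namely P6 and P8, is treated in both ways, as encoded to 0 and as encoded to 1, and 4 (= 2²) 9-bit class codes are generated: 101000101, 101000111, 101001101, and 101001111.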
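Step S17 can be sketched as an enumeration over the K pixels nearest the threshold. The pixel values below are hypothetical, since FIG. 14 is not reproduced here; they were chosen so that the base class code is 100011111 and P3, P6, and P8 are the three values closest to Th. The function name `expand_class_codes` and the tie-breaking by pixel order are assumptions of this sketch.

```python
from itertools import product


def expand_class_codes(tap, k):
    """Generate the 2**k class codes obtained by trying both 0 and 1
    for the k pixels of `tap` whose values are closest to the threshold."""
    p_min, p_max = min(tap), max(tap)
    th = p_min + (p_max - p_min) / 2
    base = ["1" if p > th else "0" for p in tap]
    # Indices of the k pixels nearest the threshold (ties keep pixel order).
    nearest = sorted(range(len(tap)), key=lambda i: abs(tap[i] - th))[:k]
    codes = set()
    for combo in product("01", repeat=k):
        bits = list(base)
        for idx, bit in zip(nearest, combo):
            bits[idx] = bit
        codes.add("".join(bits))
    return sorted(codes)


tap = [210, 10, 100, 20, 180, 115, 190, 120, 170]   # hypothetical P1 .. P9
codes_k3 = expand_class_codes(tap, 3)
```

With `k = 0` the function returns only the base class code, which matches the statement that step S17 is omitted when K = 0.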
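[0064]
This concludes the description of the class code generation processing performed by the class code generation unit 14 using the first table.
[0065]
Next, the class code generation processing performed by the class code generation unit 14 using the second table is described with reference to FIG. 16. In this processing, each pixel of the reference frame Fr is designated in turn as the target pixel, and the processing is executed for that target pixel.
[0066]
In step S21, the class code generation unit 14 sets a class code tap of a predetermined size centered on the target pixel and acquires the pixel values of the plurality of pixels included in the class code tap. In the following, as shown in FIG. 5, the class code tap is taken to be 3 × 3 pixels, and the pixel values of the pixels at the upper left, directly above, upper right, directly left, the target pixel itself, directly right, lower left, directly below, and lower right of the target pixel are denoted P1 to P9, respectively.
[0067]
In step S22, the class code generation unit 14 determines the maximum value P_MAX and the minimum value P_MIN of the pixel values P1 to P9. In step S23, the class code generation unit 14 calculates the dynamic range DR (= |P_MAX − P_MIN|) of the pixel values P1 to P9. In step S24, the class code generation unit 14 calculates the threshold Th by adding DR/2 to the minimum value P_MIN of the pixel values P1 to P9, as in equation (2).
[0068]
In step S25, the class code generation unit 14 compares each of the pixel values P1 to P9 with the threshold Th, encodes it as 1 when it is larger than Th and as 0 when it is smaller than Th, arranges the results in the order of the pixel positions, and thereby generates a 9-bit class code.
[0069]
The processing of steps S21 to S25 has already been executed, as the processing that generates a class code for each pixel of the target frame Fc, at the previous timing when the current reference frame Fr was the target frame Fc, so its result may be reused.
[0070]
In step S26, the class code generation unit 14 refers to the second table stored in the table memory 15 and determines, for the class code generated in step S25, the pixel positions of the bits to be treated in both ways as 0 and 1, using whichever of the following four methods has been selected in advance.
[0071]
The first method determines, for each class code, a predetermined number of pixel positions (for example, two) with the highest frequency of bit inversion as the pixel positions of the bits to be treated in both ways as 0 and 1.
[0072]
The second method determines, for each class code, pixel positions in descending order of frequency of bit inversion, until the cumulative frequency reaches a predetermined value, as the pixel positions of the bits to be treated in both ways as 0 and 1.
[0073]
The third method determines, from among the total of 2⁹ × 9 frequencies corresponding to all class codes in the second table, a predetermined number of pixel positions (for example, 100) with the highest frequency of bit inversion as the pixel positions of the bits to be treated in both ways as 0 and 1.
[0074]
The fourth method determines, for each class code, the pixel positions whose frequency of bit inversion is equal to or greater than a predetermined value (for example, 0.4) as the pixel positions of the bits to be treated in both ways as 0 and 1.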
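The per-class-code selection rules (the first, second, and fourth methods) can be sketched as follows, using the frequencies quoted for class code 101000110 in the second-table example. The function names, the zero-based position indices (index 1 is P2), and the cumulative target of 0.8 are assumptions for illustration. The third method ranks entries across the whole table rather than within one class code and is not sketched here.

```python
def top_n_positions(freqs, n=2):
    """First method: the n positions with the highest inversion frequency."""
    order = sorted(range(len(freqs)), key=lambda i: freqs[i], reverse=True)
    return sorted(order[:n])


def cumulative_positions(freqs, target):
    """Second method: add positions in descending frequency order until
    the accumulated frequency reaches `target`."""
    order = sorted(range(len(freqs)), key=lambda i: freqs[i], reverse=True)
    chosen, total = [], 0.0
    for i in order:
        if total >= target:
            break
        chosen.append(i)
        total += freqs[i]
    return sorted(chosen)


def threshold_positions(freqs, minimum=0.4):
    """Fourth method: every position whose frequency is at least `minimum`."""
    return [i for i, f in enumerate(freqs) if f >= minimum]


# Frequencies for P1 .. P9 of class code 101000110 (from the second table example).
freqs = [0.09, 0.55, 0.04, 0.3, 0.1, 0.09, 0.15, 0.08, 0.08]
first = top_n_positions(freqs, 2)
second = cumulative_positions(freqs, 0.8)   # 0.8 is a hypothetical target
fourth = threshold_positions(freqs, 0.4)
```

For this class code the first and second rules select P2 and P4, while the fourth rule with a threshold of 0.4 selects only P2.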
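[0075]
After the pixel positions of the bits to be treated in both ways as 0 and 1 have been determined in step S26 for the class code generated in step S25, the processing proceeds to step S27. If no bit of the class code generated in step S25 is determined in step S26 to be treated in both ways as 0 and 1, the processing of step S27 described below is omitted.
[0076]
In step S27, the class code generation unit 14 sets both cases, encoding to 0 and encoding to 1, for each pixel value among P1 to P9 whose pixel position was determined in step S26, encodes the other pixel values as 0 or 1 by comparison with the threshold Th as in step S25, and thereby generates one or more class codes for the target pixel. The result of step S25 may be reused for the encoding of the other pixel values.
[0077]
For example, when the pixel values P1 to P9 of the nine pixels included in the class code tap are as shown in FIG. 17, the 9-bit class code 101000110 is generated in step S25. If, in step S26, the pixel directly above the target pixel is determined on the basis of the second table to be the pixel position of the bit to be treated in both ways as 0 and 1, then in step S27 the pixel value P2 of the pixel directly above the target pixel is treated in both ways, as encoded to 0 and as encoded to 1, and two 9-bit class codes are generated: 101000110 and 111000110.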
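Step S27 differs from step S17 only in how the bits to vary are chosen: they are given as pixel positions rather than found by distance from the threshold. A minimal sketch, with a hypothetical function name `expand_at_positions` and zero-based positions (index 1 is P2):

```python
from itertools import product


def expand_at_positions(code, positions):
    """Return every class code obtained by trying both 0 and 1 at the
    given zero-based bit positions of `code`."""
    codes = set()
    for combo in product("01", repeat=len(positions)):
        bits = list(code)
        for idx, bit in zip(positions, combo):
            bits[idx] = bit
        codes.add("".join(bits))
    return sorted(codes)


# Vary only P2 (index 1), the pixel directly above the target pixel.
codes = expand_at_positions("101000110", [1])
```

With an empty position list the function returns just the original class code, matching the case in which step S27 is omitted.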
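[0078]
This concludes the description of the class code generation processing performed by the class code generation unit 14 using the second table.
[0079]
As described above, when all the pixels included in the class code tap are encoded by ADRC to generate a class code, the bits that can be statistically judged to be prone to bit inversion are treated in both ways, as encoded to 0 and as encoded to 1, so that one or more class codes are generated for each pixel of the reference frame Fr. This improves the robustness of the class codes.
[0080]
The number of pixels constituting the class code tap, that is, the number of bits of the class code, is not limited to the example described above and may be chosen freely.
[0081]
As described above, according to this embodiment, the class code generation unit 14 can generate highly robust class codes for each pixel of the reference frame Fr by the simple computation of 1-bit ADRC. The pixels of the target frame Fc and the pixels of the reference frame Fr can therefore be matched with high accuracy, and motion vectors can consequently be detected accurately.
[0082]
The present invention can also be applied to generating class codes for arbitrary data other than the pixel values of the pixels of an image, such as audio data.
[0083]
The series of processes described above can be executed by hardware, but can also be executed by software. When the series of processes is executed by software, the program constituting the software is installed from a recording medium onto a computer built into dedicated hardware, or onto, for example, a general-purpose personal computer configured as shown in FIG. 18, which can execute various functions by having various programs installed.
[0084]
This personal computer contains a CPU (Central Processing Unit) 31. An input/output interface 35 is connected to the CPU 31 via a bus 34. A ROM (Read Only Memory) 32 and a RAM (Random Access Memory) 33 are connected to the bus 34.
[0085]
Connected to the input/output interface 35 are an input unit 36 consisting of input devices such as a keyboard and a mouse with which the user enters operation commands; an output unit 37 consisting of a CRT (Cathode Ray Tube), LCD (Liquid Crystal Display), or the like that displays operation screens and screens showing processing results; a storage unit 38 consisting of a hard disk drive or the like that stores programs and various data; and a communication unit 39 consisting of a modem, LAN (Local Area Network) adapter, or the like that performs communication over networks such as the Internet. Also connected is a drive 40 that reads data from and writes data to recording media such as a magnetic disk 41, an optical disk 42, a magneto-optical disk 43, and a semiconductor memory 44.
[0086]
The program that causes the CPU 31 to execute the series of processes described above is supplied to the personal computer stored on the magnetic disk 41 (including flexible disks), the optical disk 42 (including CD-ROM (Compact Disc-Read Only Memory) and DVD (Digital Versatile Disc)), the magneto-optical disk 43 (including MD (Mini Disc)), or the semiconductor memory 44, is read by the drive 40, and is installed on the hard disk drive built into the storage unit 38. The program installed in the storage unit 38 is loaded from the storage unit 38 into the RAM 33 and executed in response to a command from the CPU 31 corresponding to a user command entered at the input unit 36.
[0087]
In this specification, the steps describing the program recorded on the recording medium include not only processing performed in time series in the described order but also processing executed in parallel or individually without necessarily being processed in time series.
[0088]
[Effects of the Invention]
As described above, according to the present invention, matching between images can be performed with only a small amount of computation. Furthermore, according to the present invention, motion vectors and the like can be detected accurately.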
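[Brief Description of the Drawings]
[FIG. 1] A block diagram showing a configuration example of a conventional motion vector detection device.
[FIG. 2] A diagram showing the correspondence between the target frame Fc and the reference frame Fr.
[FIG. 3] A flowchart explaining the block matching algorithm.
[FIG. 4] A block diagram showing a configuration example of a motion vector detection device according to an embodiment of the present invention.
[FIG. 5] A diagram showing a 3 × 3 pixel class code tap.
[FIG. 6] A diagram for explaining the processing that generates the first table.
[FIG. 7] A diagram for explaining the processing that generates the first table.
[FIG. 8] A diagram for explaining the processing that generates the first table.
[FIG. 9] A diagram showing an example of the first table.
[FIG. 10] A diagram for explaining the processing that generates the second table.
[FIG. 11] A diagram showing an example of the second table.
[FIG. 12] A diagram showing the structure of the ME memory of FIG. 4.
[FIG. 13] A flowchart explaining the class code generation processing using the first table by the class code generation unit of FIG. 4.
[FIG. 14] A diagram showing an example of the class code generation processing using the first table.
[FIG. 15] A diagram showing an example of the class code generation processing using the first table.
[FIG. 16] A flowchart explaining the class code generation processing using the second table by the class code generation unit of FIG. 4.
[FIG. 17] A diagram showing an example of the class code generation processing using the second table.
[FIG. 18] A block diagram showing a configuration example of a general-purpose personal computer.
[Explanation of Reference Numerals]
11 motion vector detection device, 12, 13 frame memories, 14 class code generation unit, 15 table memory, 16 ME memory, 17 motion vector calculation unit, 31 CPU, 41 magnetic disk, 42 optical disk, 43 magneto-optical disk, 44 semiconductor memory
[0001]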
BACKGROUND OF THE INVENTION
The present invention relates to an image processing apparatus and method, a recording medium, and a program, and in particular to an image processing apparatus and method, a recording medium, and a program suitable for use when, for example, the pixels of a pair of temporally consecutive images are matched with each other and a motion vector is detected in units of pixels based on the matching result.
[0002]
[Prior art]
For example, in a process of compressing and encoding an image signal such as an MPEG2 (Moving Picture Experts Group) system, an encoding process based on a correlation between two adjacent frames, so-called motion compensation interframe prediction, is used. In motion compensation inter-frame prediction, it is necessary to detect a motion vector for each pixel unit or for a pixel block unit of a predetermined size between two adjacent frames (one is described as a target frame and the other is referred to as a reference frame).
[0003]
Conventionally, a method called a block matching algorithm is used as a method for detecting a motion vector (see, for example, Patent Document 1).
[0004]
FIG. 1 shows a configuration example of a motion vector detection apparatus that detects a motion vector according to a block matching algorithm. The motion vector detection device 1 comprises frame memories 2 and 3, each of which holds one frame of the input image signal and outputs it to the subsequent stage, and a detection unit 4 that, based on the image signals input one frame at a time from the frame memories 2 and 3, detects a motion vector in the image of the image signal input from the frame memory 2.
[0005]
The frame memory 2 holds only one frame of the input image signal, and outputs the held image signal to the frame memory 3 and the detection unit 4 when the image signal of the next frame is input. The frame memory 3 holds the image signal input from the frame memory 2 for one frame, and outputs the held image signal to the detection unit 4 when the image signal of the next frame is input.
[0006]
Accordingly, the image signals of two consecutive frames are input to the detection unit 4. Hereinafter, the image of the one-frame image signal input from the frame memory 2 to the detection unit 4 is described as the target frame Fc. The image of the image signal input from the frame memory 3 to the detection unit 4, which is one frame earlier than the target frame Fc, is described as the reference frame Fr.
[0007]
The detection unit 4 calculates a motion vector in the target frame Fc according to a block matching algorithm. The block matching algorithm will be described with reference to FIG. 2 showing the correspondence between the target frame Fc and the reference frame Fr and the flowchart of FIG. 3 showing the processing order.
[0008]
In the block matching algorithm, all the pixels in the target frame Fc are sequentially designated as the target pixel. For each target pixel, the sum of absolute differences of the pixel values of corresponding pixel pairs is calculated according to equation (1) between a base block of a predetermined size (L × L pixels) centered on the target pixel and a reference block (of the same size as the base block) that is moved within a search area SR provided in the reference frame Fr.
$$\sum_{i=1}^{L}\sum_{j=1}^{L}\left|F_c(i,j)-F_{rn}(i,j)\right| \qquad (1)$$
[0009]
Here, Fc(i, j) is the pixel value of a pixel of the base block, and Frn(i, j) is the pixel value of a pixel of the reference block with the identification number n. The sum over i is taken as i is incremented by 1 from 1 to L, and the sum over j is taken as j is incremented by 1 from 1 to L.
[0010]
Then, a difference vector between the center coordinates of the reference block and the coordinates of the target pixel when the sum of absolute differences of the pixel values of the corresponding pixel pairs of the base block and the reference block is minimized is calculated as a motion vector.
[0011]
Specifically, the following processing is executed for the target pixel of the target frame Fc. In step S1, the detection unit 4 sets, in the reference frame Fr, a search area SR that is larger than the base block and centered on the same coordinates as the target pixel of the target frame Fc.
[0012]
In step S2, the detection unit 4 initializes a variable min, which stores the minimum value of the sum of absolute differences, to its maximum value. For example, when the pixel value of one pixel has 8 bits and the size of the base block is 4 × 4 pixels, the variable min is initialized to 4096 (= 2⁸ × 16).
[0013]
In step S3, the detection unit 4 initializes the identification number n of the reference block to be moved within the search area SR to 1. In step S4, the detection unit 4 initializes a variable sum for storing a calculation result of the sum of absolute differences to zero.
[0014]
In step S5, the detection unit 4 calculates the sum of absolute differences between the pixel values of correspondingly positioned pixel pairs in the base block of the target frame Fc and the reference block with the identification number n in the search area SR set in the reference frame Fr, and assigns the result to the variable sum. In step S6, the detection unit 4 compares the variable sum calculated in step S5 with the variable min and determines whether the variable sum is smaller than the variable min. If it is determined that the variable sum is smaller than the variable min, the process proceeds to step S7.
[0015]
In step S7, the detection unit 4 replaces the variable min with the variable sum. Further, the detection unit 4 stores the identification number n of the current reference block as a motion vector number.
[0016]
In step S8, the detection unit 4 determines whether or not the reference block identification number n is the maximum value, that is, whether or not the reference block has been moved to the entire search area SR. If it is determined that the identification number n of the reference block is not the maximum value, the process proceeds to step S9. In step S9, the detection unit 4 increments the identification number n of the reference block by 1, returns to the process of step S4, and repeats the subsequent processes.
[0017]
If it is determined in step S6 that the variable sum is not smaller than the variable min, the process of step S7 is skipped.
[0018]
Thereafter, when it is determined in step S8 that the reference block identification number n is the maximum value, that is, when the reference block has been moved over the entire search area SR, the process proceeds to step S10.
[0019]
In step S10, the detection unit 4 calculates, as the motion vector of the target pixel of the target frame Fc, the difference vector between the center coordinates of the reference block corresponding to the identification number n stored as the motion vector number and the coordinates of the target pixel of the target frame Fc. This is the end of the description of the block matching algorithm.
[0020]
[Patent Document 1]
Japanese Patent No. 3277417
[0021]
[Problems to be solved by the invention]
The block matching algorithm described above has the problem that the amount of calculation needed for the sum of absolute differences of the pixel values of the pixel pairs in step S5 is extremely large, so that most of the time required for the image compression processing is spent on this calculation.
[0022]
The present invention has been made in view of such a situation, and an object of the present invention is to enable matching between images with only a small amount of calculation and to detect a motion vector or the like with high accuracy.
[0023]
[Means for Solving the Problems]
An image processing apparatus according to the present invention includes a setting unit that sets a block of a predetermined size including a target pixel of interest in an input image and pixels in the vicinity thereof, and a plurality of blocks included in the block set by the setting unit The pixel value of the pixel Add 1/2 of the dynamic range represented by the difference between the maximum value and the minimum value of the plurality of pixels included in the block to the minimum value, A calculating means for calculating a threshold; Bits in the class code obtained by setting a block of a predetermined size for multiple sample images, generating a class code corresponding to the set block, and comparing it with the class code for the image of the previous frame The number of bits that determine the cumulative frequency of inversions to a predetermined value, the class code, Referring to the holding means for holding the correspondence table associated with the correspondence table held by the holding means, Generated Acquisition means for acquiring the number of bits corresponding to the class code; The pixel values of a plurality of pixels included in the block set by the setting unit are each encoded as 1 when larger than the threshold calculated by the calculation unit, and are encoded as 0 when smaller than the threshold. Then, a class code for the target pixel is generated, and the pixel value of a plurality of pixels included in the block corresponding to the generated class code is acquired by the acquisition unit from the one closer to the threshold calculated by the calculation unit. Generating means for encoding pixel values of pixels having the same number of bits into two values of 0 and 1, respectively. .
[0027]
Of the present invention image A processing method includes: a setting step for setting a block of a predetermined size including a target pixel of interest in an input image and neighboring pixels; and a plurality of pixels included in the block set by the processing of the setting step Value Add 1/2 of the dynamic range represented by the difference between the maximum value and the minimum value of the plurality of pixels included in the block to the minimum value, A calculating step for calculating a threshold; Bits in the class code obtained by setting a block of a predetermined size for multiple sample images, generating a class code corresponding to the set block, and comparing it with the class code for the image of the previous frame The number of bits that determine the cumulative frequency of inversions to a predetermined value, the class code, Referring to the correspondence table with which Generated An acquisition step for acquiring the number of bits corresponding to the class code; The pixel values of a plurality of pixels included in the block set in the setting step process are each encoded as 1 when the pixel value is larger than the threshold value calculated in the calculation step process, and when the pixel value is smaller than the threshold value, Encode to 0 to generate a class code for the pixel of interest, and from among the pixel values of a plurality of pixels included in the block corresponding to the generated class code, from the one closer to the threshold calculated in the calculation step processing And a generation step of encoding pixel values of pixels of the number of bits acquired in the processing of the acquisition step in two ways of 0 and 1, respectively. .
[0028]
The recording medium program of the present invention is included in a setting step for setting a block of a predetermined size including a target pixel of interest in an input image and pixels in the vicinity thereof, and a block set by the processing of the setting step Of pixel values of multiple pixels Add 1/2 of the dynamic range represented by the difference between the maximum value and the minimum value of the plurality of pixels included in the block to the minimum value, A calculating step for calculating a threshold; Bits in the class code obtained by setting a block of a predetermined size for a plurality of sample images, generating a class code corresponding to the set block, and comparing it with the class code for the image of the previous frame The number of bits that determine the cumulative frequency of inversions to a predetermined value, the class code, Referring to the correspondence table with which Generated An acquisition step for acquiring the number of bits corresponding to the class code; The pixel values of a plurality of pixels included in the block set in the setting step process are each encoded as 1 when the pixel value is larger than the threshold value calculated in the calculation step process, and when the pixel value is smaller than the threshold value, Encode to 0 to generate a class code for the pixel of interest, and from among the pixel values of a plurality of pixels included in the block corresponding to the generated class code, from the one closer to the threshold calculated in the calculation step processing , Causing a computer to execute processing including a generation step of encoding pixel values of pixels having the number of bits acquired in the processing of the acquisition step in two ways of 0 and 1, respectively. .
[0029]
The program of the present invention includes a setting step for setting a block of a predetermined size including a target pixel of interest in an input image and pixels in the vicinity thereof, and a plurality of pixels included in the block set by the processing of the setting step Of pixel values Add 1/2 of the dynamic range represented by the difference between the maximum value and the minimum value of the plurality of pixels included in the block to the minimum value, A calculating step for calculating a threshold; Bits in the class code obtained by setting a block of a predetermined size for a plurality of sample images, generating a class code corresponding to the set block, and comparing it with the class code for the image of the previous frame The number of bits that determine the cumulative frequency of inversions to a predetermined value, the class code, Referring to the correspondence table with which Generated An acquisition step for acquiring the number of bits corresponding to the class code; The pixel values of a plurality of pixels included in the block set in the setting step process are each encoded as 1 when the pixel value is larger than the threshold value calculated in the calculation step process, and when the pixel value is smaller than the threshold value, Encode to 0 to generate a class code for the pixel of interest, and from among the pixel values of a plurality of pixels included in the block corresponding to the generated class code, from the one closer to the threshold calculated in the calculation step processing , Causing the computer to execute a process including a generation step of encoding the pixel value of the pixel having the number of bits acquired in the process of the acquisition step in two ways of 0 and 1, respectively. .
[0030]
In the image processing apparatus and method, the recording medium, and the program of the present invention, each of the pixel values of the plurality of pixels included in a block of a predetermined size including the pixel of interest is encoded as 1 when it is larger than the calculated threshold and as 0 when it is smaller than the threshold, whereby a class code is generated for the pixel of interest. Further, from among the pixel values of the plurality of pixels included in the block corresponding to the generated class code, the pixel values of as many pixels as the acquired number of bits, taken in order of closeness to the calculated threshold, are each encoded both as 0 and as 1.
[0031]
DETAILED DESCRIPTION OF THE INVENTION
A configuration example of a motion vector detection apparatus to which the present invention is applied will be described with reference to FIG. The motion vector detection device 11 includes frame memories 12 and 13, a class code generation unit 14, a table memory 15, an ME memory 16, and a motion vector calculation unit 17.
[0032]
The frame memory 12 holds the input image signal for only one frame. When the image signal of the next frame is input, the frame memory 12 outputs the held image signal to the frame memory 13 and the class code generation unit 14. The frame memory 13 holds the input image signal for only one frame, and outputs the held image signal to the class code generation unit 14 when the image signal of the next frame is input from the frame memory 12.
[0033]
Therefore, the class code generation unit 14 receives the image signals of two temporally consecutive frames. Hereinafter, the image of the one-frame image signal input from the frame memory 12 to the class code generation unit 14 is described as the target frame Fc. The image of the image signal input from the frame memory 13 to the class code generation unit 14, which is one frame earlier than the target frame Fc, is described as the reference frame Fr.
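The behavior of the two cascaded frame memories can be illustrated with a minimal sketch. The function name `frame_pairs` and the use of `None` to mark an empty memory are assumptions made for this illustration and do not appear in the specification.

```python
from typing import Iterable, Iterator, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")


def frame_pairs(frames: Iterable[Frame]) -> Iterator[Tuple[Frame, Frame]]:
    """Yield (Fc, Fr) pairs as two cascaded one-frame memories would deliver them.

    Hypothetical helper: mem12 and mem13 stand in for frame memories 12 and 13.
    """
    mem12: Optional[Frame] = None
    mem13: Optional[Frame] = None
    for incoming in frames:
        out12 = mem12   # memory 12 releases the frame it was holding (Fc)
        out13 = mem13   # memory 13 releases the frame it was holding (Fr)
        mem12 = incoming
        mem13 = out12   # memory 13 now holds what memory 12 just released
        if out12 is not None and out13 is not None:
            yield out12, out13
```

In this sketch no pair is produced until both memories have been filled, so the first pair appears when the third frame arrives, and Fr is always the frame immediately preceding Fc.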
[0034]
The class code generation unit 14 generates, for every pixel of the target frame Fc, a single class code indicating a spatial feature by using the pixel values of neighboring pixels, and outputs the class codes to the motion vector calculation unit 17. Specifically, all the pixels of the target frame Fc are sequentially set as the pixel of interest, and a class code tap of a predetermined size (for example, 3 × 3 pixels as shown in FIG. 5) centered on the pixel of interest is set. Then, the pixel values of the plurality of pixels (9 pixels in the case of FIG. 5) included in the class code tap are each encoded as 0 or 1 by 1-bit ADRC (Adaptive Dynamic Range Coding), and a class code of a predetermined number of bits (9 bits in the case of FIG. 5) is generated.
[0035]
Further, the class code generation unit 14 generates, for every pixel of the reference frame Fr, a single class code indicating a spatial feature by using the pixel values of neighboring pixels. However, since the reference frame Fr was the target frame Fc at the previous timing, a class code for each pixel of the reference frame Fr has already been generated at the previous timing. Therefore, the class code generated for each pixel of the target frame Fc at the previous timing may be held and used as the class code for each pixel of the reference frame Fr at the current timing.
[0036]
Furthermore, based on the first or second table stored in advance in the table memory 15, the class code generation unit 14 treats predetermined bits among the plurality of bits constituting the single generated class code both ways, as 0 and as 1, thereby generating one or more types of class codes, and outputs them to the ME memory 16. The process for generating the class codes for the pixels of the reference frame Fr will be described later.
[0037]
The table memory 15 stores a first table indicating the correspondence between each class code and the number of bits to be encoded both as 0 and as 1, and a second table indicating, for each class code, the frequency of occurrence of bit inversion at each pixel position in the class code tap. It is assumed that the first and second tables are statistically generated in advance using a plurality of sample image signals and stored in the table memory 15 in advance.
[0038]
Processing for generating the first table will be described with reference to FIGS. First, a class code tap is set for each pixel of a plurality of sample image signals, and a class code is generated by ADRC. Next, all the generated class codes are compared with the pixel values in the original class code tap to identify the bits of the class code in which inversion occurs, and the number of such bits is counted. Based on the counting result, a histogram is generated with the number of bits on the horizontal axis and the normalized frequency of bit inversion on the vertical axis, as shown in FIG.
[0039]
FIG. 6 shows polygonal lines a to d indicating the counting results for four types of class codes. In practice, for example, when the class code is 9 bits, 512 (= 2^9) polygonal lines are obtained, one indicating the counting result for each type of class code. Further, for each class code, the number of bits K is determined so that the cumulative frequency of occurrence of bit inversion reaches a predetermined value, and is written in the first table. As the predetermined value, a common value is used for all class codes.
[0040]
For example, when the predetermined value is set to 40%, the number of bits K = 1 is determined for the class code corresponding to the polygonal line d, as shown in FIG. Likewise, when the predetermined value is set to 80%, the number of bits K = 3 is determined for the class code corresponding to the polygonal line d, as shown in FIG.
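The determination of K from the histogram can be sketched as follows. This is only an illustration under stated assumptions: the function name `bits_for_cumulative_target` is hypothetical, the histogram is assumed to start at zero inverted bits, and the values in `LINE_D` are invented stand-ins for polygonal line d rather than data read from the figures.

```python
from typing import Sequence


def bits_for_cumulative_target(hist: Sequence[float], target: float) -> int:
    """Return the smallest K whose cumulative frequency reaches `target`.

    Assumption: hist[n] is the normalized frequency with which exactly
    n bits of the class code were inverted (n = 0, 1, 2, ...).
    """
    cumulative = 0.0
    for k, freq in enumerate(hist):
        cumulative += freq
        if cumulative >= target:
            return k
    # Target never reached: fall back to the largest bin (an assumption).
    return len(hist) - 1


# Hypothetical histogram standing in for polygonal line d.
LINE_D = [0.15, 0.30, 0.20, 0.20, 0.10, 0.05, 0.0, 0.0, 0.0, 0.0]
```

With this made-up histogram, a predetermined value of 40% gives K = 1 and a predetermined value of 80% gives K = 3, mirroring the example in the text. A histogram whose first bin already reaches the predetermined value gives K = 0.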
[0041]
FIG. 9 shows an example of a first table corresponding to a 9-bit class code generated by using the class code tap of FIG. In this example, the class code 100011111 is associated with the number of bits K = 3. Further, the class code 101001111 is associated with the number of bits K = 2.
[0042]
Next, the process of generating the second table will be described with reference to FIG. First, a class code tap is set for each pixel of a plurality of sample image signals, and a class code is generated by ADRC. Next, all the generated class codes are compared with the plurality of pixel values in the original class code tap to identify the bits of the class code in which inversion occurs, and the pixel position within the class code tap corresponding to each such bit is determined. Based on the determination result, a histogram is generated with the pixel position in the class code tap on the horizontal axis and the normalized frequency of bit inversion on the vertical axis, as shown in FIG.
[0043]
In FIG. 10, the determination results for three types of class codes are shown using the polygonal lines e to g. In practice, for example, when the class code is 9 bits, 512 (= 2^9) polygonal lines are obtained, one indicating the determination result for each type of class code.
[0044]
FIG. 11 shows an example of the second table corresponding to the 9-bit class code generated by using the class code tap of FIG. The second table stores the frequency of occurrence of bit inversion at nine pixel positions corresponding to each class code of 9 bits.
[0045]
For example, for the class code 000000000, the table indicates that the frequency of bit inversion is 0 when the pixel value P1 at the upper left of the pixel of interest is encoded; 0.05 when the pixel value P2 directly above the pixel of interest is encoded; 0 when the pixel value P3 at the upper right, the pixel value P4 to the immediate left, and the pixel value P5 of the pixel of interest itself are encoded; 0.06 when the pixel value P6 to the immediate right is encoded; and 0 when the pixel value P7 at the lower left, the pixel value P8 directly below, and the pixel value P9 at the lower right of the pixel of interest are encoded.
[0046]
Similarly, for the class code 101000110, the table indicates that the frequency of bit inversion is 0.09 when the pixel value P1 at the upper left of the pixel of interest is encoded; 0.55 when the pixel value P2 directly above is encoded; 0.04 when the pixel value P3 at the upper right is encoded; 0.3 when the pixel value P4 to the immediate left is encoded; 0.1 when the pixel value P5 of the pixel of interest itself is encoded; 0.09 when the pixel value P6 to the immediate right is encoded; 0.15 when the pixel value P7 at the lower left is encoded; and 0.08 when the pixel value P8 directly below and the pixel value P9 at the lower right are encoded.
[0047]
Returning to FIG. The ME memory 16 stores the one or more types of class codes input from the class code generation unit 14 for each pixel of the reference frame Fr, associating each class code with the coordinates of the pixel.
[0048]
FIG. 12 shows the structure of the ME memory 16. The ME memory 16 includes (a + 1) × (b + 1) cells indicated by feature amount addresses 0 to a and flag addresses 0 to b. Hereinafter, for example, the cell of the feature amount address 1 and the flag address 2 is described as a cell (1, 2).
[0049]
The feature amount address corresponds to the class code generated by the class code generation unit 14 for each pixel of the reference frame Fr. For example, when the class code is 9 bits, the maximum value a of the feature amount address is 2^9.
[0050]
In the cells at flag address 1 and later of feature amount address 0, the coordinates of the pixels of the reference frame Fr for which the class code 0 was generated by the class code generation unit 14 are stored in raster order. The cell (0, 0) at flag address 0 of feature amount address 0 stores the number of cells at flag address 1 and later of feature amount address 0 in which the coordinates of pixels of class code 0 are already stored, that is, the number of used cells. For example, when the class code 0 is generated for three pixels of the reference frame Fr, the coordinates of the three pixels are stored in the cell (0, 1), the cell (0, 2), and the cell (0, 3), and 3, the number of cells in which coordinates are stored, is stored in the cell (0, 0).
[0051]
In the cells at flag address 1 and later of feature amount address 1, the coordinates of the pixels of the reference frame Fr for which the class code 1 was generated by the class code generation unit 14 are stored in raster order. The cell (1, 0) at flag address 0 of feature amount address 1 stores the number of cells at flag address 1 and later of feature amount address 1 in which the coordinates of pixels of class code 1 are stored. For example, when the same class code 1 is generated for 10 of the pixels of the reference frame Fr, the coordinates of the 10 pixels are stored in the cells (1, 1) to (1, 10), and 10, the number of cells in which coordinates are stored, is stored in the cell (1, 0). The same applies to the cells at feature amount address 2 and later, so their description is omitted.
[0052]
Also, for example, when three types of class codes are generated for a pixel of the reference frame Fr, the coordinates of that pixel are stored in three cells, one at each of the feature amount addresses corresponding to the three types of class codes.
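The cell layout of the ME memory described above can be modeled with a small sketch. The class name `MEMemory` and its methods `store` and `cell` are hypothetical, and a growable row per feature amount address is used in place of a fixed-size memory purely for brevity.

```python
from typing import Dict, List, Tuple, Union

Coord = Tuple[int, int]


class MEMemory:
    """Toy model: each row is [count, coord_1, coord_2, ...] for one class code."""

    def __init__(self) -> None:
        self._rows: Dict[int, List[Union[int, Coord]]] = {}

    def store(self, feature_address: int, coord: Coord) -> None:
        # Flag address 0 holds the number of used cells; coordinates follow it.
        row = self._rows.setdefault(feature_address, [0])
        row.append(coord)
        row[0] += 1

    def cell(self, feature_address: int, flag_address: int) -> Union[int, Coord]:
        # An address that was never written behaves like an empty row.
        return self._rows.get(feature_address, [0])[flag_address]
```

In this model, storing three pixels under class code 0 leaves 3 in the cell (0, 0) and the three coordinates in the cells (0, 1) to (0, 3). A pixel for which several class codes were generated is simply stored once under each of them, and the caller is assumed to present pixels in raster order.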
[0053]
Returning to FIG. For each pixel of the target frame Fc, the motion vector calculation unit 17 searches the ME memory 16 for the coordinates of the pixels of the reference frame Fr for which the same class code was generated. Among the pixels found, the pixel whose coordinates are closest to those of the pixel of interest is determined as the pixel of the reference frame Fr corresponding to the pixel of interest of the target frame Fc, and the motion vector of the pixel of interest is calculated.
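The search performed by the motion vector calculation unit 17 can be sketched as below. Several points are assumptions made for this illustration: the function name `motion_vector`, the use of a plain dictionary in place of the ME memory 16, squared Euclidean distance as the measure of closeness, and the sign convention (reference coordinates minus target coordinates).

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int]


def motion_vector(
    target_xy: Coord,
    target_code: str,
    me_memory: Dict[str, List[Coord]],
) -> Optional[Coord]:
    """Return the vector to the nearest reference pixel sharing the class code."""
    candidates = me_memory.get(target_code, [])
    if not candidates:
        return None  # no reference pixel has this class code
    tx, ty = target_xy
    # Closeness is measured here by squared Euclidean distance (an assumption).
    bx, by = min(candidates, key=lambda c: (c[0] - tx) ** 2 + (c[1] - ty) ** 2)
    return (bx - tx, by - ty)
```

Because only the coordinates stored under one class code are examined, no pixel-by-pixel block comparison is needed in this sketch, which is where the reduction in computation comes from.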
[0055]
Next, class code generation processing using the first table by the class code generation unit 14 will be described with reference to the flowchart of FIG. In this process, each pixel of the reference frame Fr is sequentially designated as the pixel of interest, and the process is executed for that pixel.
[0056]
In step S11, the class code generation unit 14 sets a class code tap of a predetermined size centered on the pixel of interest, and acquires the pixel values of the plurality of pixels included in the class code tap. In the following, as shown in FIG. 5, the description continues on the assumption that the size of the class code tap is 3 × 3 pixels and that the pixel values of the upper-left pixel, the pixel directly above, the upper-right pixel, the pixel to the immediate left, the pixel of interest itself, the pixel to the immediate right, the lower-left pixel, the pixel directly below, and the lower-right pixel are P1 to P9, respectively.
[0057]
In step S12, the class code generation unit 14 determines the maximum value P_MAX and the minimum value P_MIN of the pixel values P1 to P9. In step S13, the class code generation unit 14 calculates the dynamic range DR (= |P_MAX - P_MIN|) of the pixel values P1 to P9. In step S14, the class code generation unit 14 calculates the threshold Th by adding DR/2 to the minimum value P_MIN of the pixel values P1 to P9, as in the following equation (2).
Th = P_MIN + DR/2 ... (2)
[0058]
In step S15, the class code generation unit 14 compares each of the pixel values P1 to P9 with the threshold Th, encoding a pixel value as 1 when it is larger than the threshold Th and as 0 when it is smaller than the threshold Th. The 9-bit class code is generated by arranging the resulting bits in the order of the pixels.
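Steps S12 to S15 can be sketched as follows. The function names `adrc_threshold` and `adrc_class_code` are hypothetical. The text does not state how a pixel value exactly equal to the threshold is encoded, so this sketch assumes it is encoded as 0. The sample tap is invented for illustration and is not taken from the figures.

```python
from typing import Sequence


def adrc_threshold(tap: Sequence[int]) -> float:
    """Equation (2): Th = P_MIN + DR / 2, with DR = |P_MAX - P_MIN|."""
    p_min, p_max = min(tap), max(tap)
    dynamic_range = abs(p_max - p_min)
    return p_min + dynamic_range / 2


def adrc_class_code(tap: Sequence[int]) -> str:
    """Encode P1..P9 (in that order) to a bit string by 1-bit ADRC."""
    th = adrc_threshold(tap)
    # Assumption: a value equal to the threshold is encoded as 0.
    return "".join("1" if p > th else "0" for p in tap)


# Hypothetical 3 x 3 tap listed in the order P1..P9.
SAMPLE_TAP = [200, 20, 95, 0, 180, 105, 190, 110, 200]
```

For the sample tap the minimum is 0 and the maximum is 200, so the threshold is 100 and the resulting class code is 100011111.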
[0059]
Note that the processing in steps S11 to S15 was already executed, as the process of generating a class code for each pixel of the target frame Fc, at the previous timing when the current reference frame Fr was the target frame Fc. That result may therefore be reused.
[0060]
In step S16, the class code generation unit 14 refers to the first table stored in the table memory 15, and acquires the number K of bits corresponding to the class code generated in the process of step S15. If the number of bits K acquired here is 0, the process of step S17 described below is omitted.
[0061]
In step S17, the class code generation unit 14 encodes each of the K pixel values closest to the threshold Th, among the pixel values P1 to P9, both ways, as 0 and as 1, and encodes the remaining pixel values as 0 or 1 by comparison with the threshold Th as in the process of step S15, thereby generating 2^K types of class codes for the pixel of interest. Note that the result of step S15 may be reused for encoding the remaining pixel values.
[0062]
For example, when the pixel values P1 to P9 of the nine pixels included in the class code tap are in the state shown in FIG. 14, the 9-bit class code 100011111 is generated by the process of step S15. Then, the number of bits K = 3 corresponding to the 9-bit class code 100011111 is acquired from the first table by the process of step S16. Further, in step S17, each of the three pixel values P3, P6, and P8 closest to the threshold Th is encoded both ways, as 0 and as 1, and 8 (= 2^3) types of 9-bit class codes are generated: 100010101, 100010111, 100011101, 100011111, 101010101, 101010111, 101011101, and 101011111.
[0063]
Further, for example, when the pixel values P1 to P9 of the nine pixels included in the class code tap are in the state shown in FIG. 15, the 9-bit class code 101001111 is generated by the process of step S15. Then, the number of bits K = 2 corresponding to the 9-bit class code 101001111 is acquired from the first table by the process of step S16. Further, in step S17, each of the two pixel values P6 and P8 closest to the threshold Th is encoded both ways, as 0 and as 1, and 4 (= 2^2) types of 9-bit class codes are generated: 101000101, 101000111, 101001101, and 101001111.
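The generation of 2^K class codes in step S17 can be sketched as follows. The function name `expand_class_codes` is hypothetical, and the two taps are invented so that they reproduce the class codes of the two examples above; the actual pixel values of FIGS. 14 and 15 are not available here. Ties in distance to the threshold are broken by pixel order, which is also an assumption.

```python
from itertools import product
from typing import List, Sequence


def expand_class_codes(tap: Sequence[int], k: int) -> List[str]:
    """Return the 2**k class codes obtained by encoding the k pixel values
    closest to the ADRC threshold both as 0 and as 1."""
    p_min, p_max = min(tap), max(tap)
    th = p_min + (p_max - p_min) / 2
    base = [1 if p > th else 0 for p in tap]          # result of step S15
    # Indices of the k pixel values nearest the threshold (ties: pixel order).
    nearest = sorted(range(len(tap)), key=lambda i: abs(tap[i] - th))[:k]
    codes = []
    for bits in product((0, 1), repeat=k):            # every 0/1 combination
        code = base[:]
        for index, bit in zip(nearest, bits):
            code[index] = bit
        codes.append("".join(str(b) for b in code))
    return sorted(codes)


# Hypothetical taps (P1..P9) chosen to match the two examples in the text.
TAP_K3 = [200, 20, 95, 0, 180, 105, 190, 110, 200]   # base code 100011111
TAP_K2 = [200, 10, 180, 0, 20, 104, 190, 108, 200]   # base code 101001111
```

With K = 3 the first tap yields the eight class codes listed in the text, and with K = 2 the second tap yields the four listed there. With K = 0 only the class code of step S15 is returned, which corresponds to omitting step S17.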
[0064]
This concludes the description of the class code generation process using the first table by the class code generation unit 14.
[0065]
Next, class code generation processing using the second table by the class code generation unit 14 will be described with reference to FIG. In this process, each pixel of the reference frame Fr is sequentially designated as the pixel of interest, and the process is executed for that pixel.
[0066]
In step S21, the class code generation unit 14 sets a class code tap of a predetermined size centered on the pixel of interest, and acquires the pixel values of the plurality of pixels included in the class code tap. In the following, as shown in FIG. 5, the description continues on the assumption that the size of the class code tap is 3 × 3 pixels and that the pixel values of the upper-left pixel, the pixel directly above, the upper-right pixel, the pixel to the immediate left, the pixel of interest itself, the pixel to the immediate right, the lower-left pixel, the pixel directly below, and the lower-right pixel are P1 to P9, respectively.
[0067]
In step S22, the class code generation unit 14 determines the maximum value P_MAX and the minimum value P_MIN of the pixel values P1 to P9. In step S23, the class code generation unit 14 calculates the dynamic range DR (= |P_MAX - P_MIN|) of the pixel values P1 to P9. In step S24, the class code generation unit 14 calculates the threshold Th by adding DR/2 to the minimum value P_MIN of the pixel values P1 to P9, as shown in equation (2).
[0068]
In step S25, the class code generation unit 14 compares each of the pixel values P1 to P9 with the threshold Th, encoding a pixel value as 1 when it is larger than the threshold Th and as 0 when it is smaller than the threshold Th. The 9-bit class code is generated by arranging the resulting bits in the order of the pixels.
[0069]
Note that the processing in steps S21 to S25 was already executed, as the process of generating a class code for each pixel of the target frame Fc, at the previous timing when the current reference frame Fr was the target frame Fc. That result may therefore be reused.
[0070]
In step S26, the class code generation unit 14 refers to the second table stored in the table memory 15 and determines, using a predetermined one of the following four methods, the pixel positions of the bits of the class code generated in step S25 that are to be encoded both as 0 and as 1.
[0071]
In the first method, for each class code, a predetermined number (for example, two) of pixel positions with the highest frequency of occurrence of bit inversion are determined as the pixel positions of the bits to be encoded both as 0 and as 1.
[0072]
In the second method, for each class code, pixel positions are taken in descending order of the frequency of occurrence of bit inversion until the cumulative frequency reaches a predetermined value, and those pixel positions are determined as the pixel positions of the bits to be encoded both as 0 and as 1.
[0073]
In the third method, among the total of 2^9 × 9 occurrence frequencies corresponding to all the class codes in the second table, a predetermined number (for example, 100) of pixel positions with the highest frequency of occurrence of bit inversion are determined as the pixel positions of the bits to be encoded both as 0 and as 1.
[0074]
In the fourth method, for each class code, the pixel positions at which the frequency of occurrence of bit inversion is equal to or greater than a predetermined value (for example, 0.4) are determined as the pixel positions of the bits to be encoded both as 0 and as 1.
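The four selection methods can be sketched as below. All function names are hypothetical, positions are 0-based indices (index 1 corresponds to P2), and the two table rows are the example rows given for the second table. For the second method, the sketch assumes that the per-position frequencies are simply summed until the predetermined value is reached and that positions with frequency 0 are never selected; the text does not spell out either detail.

```python
from typing import Dict, List, Sequence

# Rows of the second table quoted in the text (frequencies for P1..P9).
SECOND_TABLE: Dict[str, List[float]] = {
    "000000000": [0, 0.05, 0, 0, 0, 0.06, 0, 0, 0],
    "101000110": [0.09, 0.55, 0.04, 0.3, 0.1, 0.09, 0.15, 0.08, 0.08],
}


def by_descending_frequency(row: Sequence[float]) -> List[int]:
    return sorted(range(len(row)), key=lambda i: row[i], reverse=True)


def method1_top_n(row: Sequence[float], n: int) -> List[int]:
    """First method: the n positions with the highest inversion frequency."""
    return sorted(by_descending_frequency(row)[:n])


def method2_cumulative(row: Sequence[float], target: float) -> List[int]:
    """Second method: take positions until the summed frequency reaches target."""
    chosen, total = [], 0.0
    for i in by_descending_frequency(row):
        if total >= target or row[i] <= 0:
            break
        chosen.append(i)
        total += row[i]
    return sorted(chosen)


def method3_global_top(table: Dict[str, Sequence[float]], n: int) -> Dict[str, List[int]]:
    """Third method: the n highest frequencies over the whole table."""
    entries = sorted(
        ((f, code, i) for code, row in table.items() for i, f in enumerate(row)),
        reverse=True,
    )
    result: Dict[str, List[int]] = {code: [] for code in table}
    for _, code, i in entries[:n]:
        result[code].append(i)
    return {code: sorted(positions) for code, positions in result.items()}


def method4_threshold(row: Sequence[float], minimum: float) -> List[int]:
    """Fourth method: every position whose frequency is at least `minimum`."""
    return [i for i, f in enumerate(row) if f >= minimum]
```

For the row of class code 101000110, the fourth method with a predetermined value of 0.4 selects only index 1, that is, the pixel P2 directly above the pixel of interest, while the first method with two positions selects P2 and P4.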
[0075]
After the pixel positions of the bits to be encoded both as 0 and as 1 in the class code generated in the process of step S25 have been determined in step S26, the process proceeds to step S27. If, in step S26, none of the bits of the class code generated in the process of step S25 is determined to be encoded both as 0 and as 1, the process of step S27 described below is omitted.
[0076]
In step S27, the class code generation unit 14 encodes, both as 0 and as 1, each of the pixel values, among the pixel values P1 to P9, at the pixel positions determined in the process of step S26, and encodes the remaining pixel values as 0 or 1 by comparison with the threshold Th as in the process of step S25, thereby generating one or more types of class codes for the pixel of interest. Note that the result of step S25 may be reused for encoding the remaining pixel values.
[0077]
For example, when the pixel values P1 to P9 of the nine pixels included in the class code tap are in the state shown in FIG. 17, the 9-bit class code 101000110 is generated by the process of step S25. If, in step S26, the position of the pixel directly above the pixel of interest is determined, based on the second table, as the pixel position of the bit to be encoded both as 0 and as 1, then in step S27 the pixel value P2 of the pixel directly above the pixel of interest is encoded both ways, as 0 and as 1, and two types of 9-bit class codes, 101000110 and 111000110, are generated.
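Step S27 can be sketched as an expansion of the class code of step S25 over the pixel positions chosen in step S26. The function name `split_class_codes` is hypothetical and positions are 0-based indices, so the pixel P2 directly above the pixel of interest is index 1.

```python
from itertools import product
from typing import List, Sequence


def split_class_codes(base_code: str, positions: Sequence[int]) -> List[str]:
    """Return every class code obtained by setting the bits at `positions`
    to both 0 and 1 while keeping all other bits of `base_code`."""
    codes = []
    for bits in product("01", repeat=len(positions)):
        code = list(base_code)
        for position, bit in zip(positions, bits):
            code[position] = bit
        codes.append("".join(code))
    return codes
```

Calling `split_class_codes("101000110", [1])` gives the two class codes 101000110 and 111000110 of the example. An empty list of positions returns only the original class code, which corresponds to the case in which step S27 is omitted.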
[0078]
This concludes the description of the class code generation process using the second table by the class code generation unit 14.
[0079]
In this way, when all the pixels included in the class code tap are encoded by ADRC to generate the class code, any bit that is statistically judged likely to undergo bit inversion is encoded both ways, as 0 and as 1, so that one or more types of class codes are generated for each pixel of the reference frame Fr. The robustness of the class codes can thereby be improved.
[0080]
The number of pixels constituting the class code tap, that is, the number of bits of the class code is not limited to the above-described example, and is arbitrary.
[0081]
As described above, according to the present embodiment, the class code generation unit 14 can generate a highly robust class code for each pixel of the reference frame Fr by the simple calculation of 1-bit ADRC. Therefore, the pixels of the target frame Fc and the pixels of the reference frame Fr can be matched with high accuracy, and the motion vector can consequently be detected with high accuracy.
[0082]
Further, the present invention can be applied to a case where a class code is generated for arbitrary data such as audio data in addition to the pixel values of pixels constituting an image.
[0083]
The series of processes described above can be executed by hardware, but can also be executed by software. When the series of processes is executed by software, the program constituting the software is installed from a recording medium into a computer incorporated in dedicated hardware or into, for example, a general-purpose personal computer that can execute various functions by installing various programs, configured as shown in FIG.
[0084]
This personal computer includes a CPU (Central Processing Unit) 31. An input / output interface 35 is connected to the CPU 31 via the bus 34. A ROM (Read Only Memory) 32 and a RAM (Random Access Memory) 33 are connected to the bus 34.
[0085]
Connected to the input / output interface 35 are an input unit 36 including input devices such as a keyboard and a mouse with which the user inputs operation commands, an output unit 37 including a display such as a CRT (Cathode Ray Tube) or an LCD (Liquid Crystal Display) that displays an operation screen or a screen showing a processing result, a storage unit 38 including a hard disk drive that stores programs and various data, and a communication unit 39 including a modem and a LAN (Local Area Network) adapter that executes communication processing via a network represented by the Internet. Also connected is a drive 40 that reads and writes data from and to a recording medium such as the magnetic disk 41, the optical disk 42, the magneto-optical disk 43, and the semiconductor memory 44.
[0086]
The program for causing the CPU 31 to execute the above-described series of processes is supplied to the personal computer stored on the magnetic disk 41 (including a flexible disk), the optical disk 42 (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), the magneto-optical disk 43 (including an MD (Mini Disc)), or the semiconductor memory 44, is read by the drive 40, and is installed on the hard disk drive built into the storage unit 38. The program installed in the storage unit 38 is loaded from the storage unit 38 into the RAM 33 and executed in accordance with a command from the CPU 31 that corresponds to a command input by the user to the input unit 36.
[0087]
In the present specification, the steps describing the program recorded on the recording medium include not only processing performed in time series in the described order, but also processing that is executed in parallel or individually and is not necessarily performed in time series.
[0088]
[Effects of the Invention]
As described above, according to the present invention, matching between images can be performed with only a small amount of calculation. Further, according to the present invention, it is possible to detect motion vectors and the like with high accuracy.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration example of a conventional motion vector detection device.
FIG. 2 is a diagram illustrating a correspondence relationship between a target frame Fc and a reference frame Fr.
FIG. 3 is a flowchart illustrating a block matching algorithm.
FIG. 4 is a block diagram illustrating a configuration example of a motion vector detection device according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a class code tap of 3 × 3 pixels.
FIG. 6 is a diagram for explaining processing for generating a first table.
FIG. 7 is a diagram for explaining processing for generating a first table.
FIG. 8 is a diagram for explaining processing for generating a first table.
FIG. 9 is a diagram illustrating an example of a first table.
FIG. 10 is a diagram for explaining processing for generating a second table.
FIG. 11 is a diagram illustrating an example of a second table.
FIG. 12 is a diagram showing the structure of the ME memory of FIG. 4.
FIG. 13 is a flowchart for explaining class code generation processing using the first table by the class code generation unit in FIG. 4.
FIG. 14 is a diagram illustrating an example of a class code generation process using a first table.
FIG. 15 is a diagram illustrating an example of a class code generation process using a first table.
FIG. 16 is a flowchart for explaining class code generation processing using a second table by the class code generation unit in FIG. 4.
FIG. 17 is a diagram illustrating an example of a class code generation process using a second table.
FIG. 18 is a block diagram illustrating a configuration example of a general-purpose personal computer.
[Explanation of symbols]
11 motion vector detection device, 12, 13 frame memory, 14 class code generation unit, 15 table memory, 16 ME memory, 17 motion vector calculation unit, 31 CPU, 41 magnetic disk, 42 optical disk, 43 magneto-optical disk, 44 semiconductor memory

Claims

An image processing apparatus that generates a class code indicating a spatial feature amount of an image and processes pixel values of pixels constituting the image based on the generated class code, the image processing apparatus comprising:
setting means for setting a block of a predetermined size including a pixel of interest in the input image and pixels in the vicinity thereof;
calculation means for calculating a threshold by adding, to the minimum value of the pixel values of the plurality of pixels included in the block set by the setting means, 1/2 of the dynamic range represented by the difference between the maximum value of the pixel values of the plurality of pixels included in the block and the minimum value;
holding means for holding a correspondence table that associates the class code with the number of bits at which the cumulative frequency of occurrence of bit inversion in the class code reaches a predetermined value, the bit inversion being obtained by setting blocks of the predetermined size for a plurality of sample images, generating the class code corresponding to each set block, and comparing it with the class code for the image of the previous frame;
acquisition means for acquiring the number of bits corresponding to the generated class code with reference to the correspondence table held by the holding means; and
generation means for generating the class code for the pixel of interest by encoding each of the pixel values of the plurality of pixels included in the block set by the setting means as 1 when the pixel value is larger than the threshold calculated by the calculation means and as 0 when it is smaller than the threshold, and for encoding, both as 0 and as 1, the pixel values of as many pixels as the number of bits acquired by the acquisition means, taken in order of closeness to the threshold calculated by the calculation means, from among the pixel values of the plurality of pixels included in the block corresponding to the generated class code.

An image processing method for an image processing apparatus that generates a class code indicating a spatial feature amount of an image and processes pixel values of pixels constituting the image based on the generated class code, the method comprising:
a setting step of setting a block of a predetermined size including a pixel of interest in the input image and pixels in its vicinity;
a calculation step of calculating a threshold by adding, to the minimum value of the pixel values of the plurality of pixels included in the block set in the setting step, half of a dynamic range represented by the difference between the maximum value and the minimum value of those pixel values;
an acquisition step of acquiring the number of bits corresponding to the generated class code by referring to a correspondence table that associates each class code with a number of bits, the number of bits being one that brings the cumulative frequency of occurrence of bit inversion in the class code to a predetermined value, the frequency being obtained by setting blocks of the predetermined size for a plurality of sample images, generating the class code corresponding to each set block, and comparing the class code with that of the image of the preceding frame; and
a generation step of generating the class code for the pixel of interest by encoding each pixel value of the plurality of pixels included in the block set in the setting step as 1 when the pixel value is larger than the threshold calculated in the calculation step and as 0 when the pixel value is smaller than the threshold, and of encoding in both ways, as 0 and as 1, the pixel values of as many pixels as the number of bits acquired in the acquisition step, selected in order of closeness to the threshold from among the plurality of pixels included in the block corresponding to the generated class code.
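The calculation and generation steps of the method can be illustrated with a minimal Python sketch. It is one reading of the claim language, not the implementation disclosed in the specification. The function names (`threshold`, `class_code`, `expanded_codes`) are hypothetical, the block is passed as a flat list of pixel values, and the number of bits that the claim obtains from the correspondence table is passed in directly as `num_bits`. A pixel value exactly equal to the threshold is encoded as 0 here, which is an assumption, since the claim only covers the larger and smaller cases.

```python
from itertools import product
from typing import List, Sequence


def threshold(block: Sequence[int]) -> float:
    """Minimum pixel value plus half the dynamic range (max - min)."""
    lo, hi = min(block), max(block)
    return lo + (hi - lo) / 2


def class_code(block: Sequence[int]) -> List[int]:
    """One bit per pixel: 1 if the value exceeds the threshold, else 0."""
    th = threshold(block)
    # Assumption: a value equal to the threshold is encoded as 0.
    return [1 if v > th else 0 for v in block]


def expanded_codes(block: Sequence[int], num_bits: int) -> List[List[int]]:
    """Encode the num_bits pixels nearest the threshold both as 0 and as 1.

    Returns 2 ** num_bits candidate class codes; with num_bits == 0 the
    result is just the plain class code.
    """
    th = threshold(block)
    base = class_code(block)
    # Pixel indices ordered by distance from the threshold, nearest first.
    # Ties keep their original order (an assumption; the claim is silent).
    order = sorted(range(len(block)), key=lambda i: abs(block[i] - th))
    nearest = order[:num_bits]
    codes = []
    for bits in product((0, 1), repeat=len(nearest)):
        code = list(base)
        for idx, bit in zip(nearest, bits):
            code[idx] = bit
        codes.append(code)
    return codes
```

For a 3×3 block `[10, 200, 90, 120, 30, 180, 100, 60, 150]`, the minimum is 10 and the dynamic range is 190, so the threshold is 105. The pixel with value 100 lies nearest the threshold, so with `num_bits = 1` the sketch yields two class codes that differ only in that pixel's bit.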

A recording medium on which is recorded a program for causing a computer to execute processing of generating a class code indicating a spatial feature amount of an image and processing pixel values of pixels constituting the image based on the generated class code, the processing comprising:
a setting step of setting a block of a predetermined size including a pixel of interest in the input image and pixels in its vicinity;
a calculation step of calculating a threshold by adding, to the minimum value of the pixel values of the plurality of pixels included in the block set in the setting step, half of a dynamic range represented by the difference between the maximum value and the minimum value of those pixel values;
an acquisition step of acquiring the number of bits corresponding to the generated class code by referring to a correspondence table that associates each class code with a number of bits, the number of bits being one that brings the cumulative frequency of occurrence of bit inversion in the class code to a predetermined value, the frequency being obtained by setting blocks of the predetermined size for a plurality of sample images, generating the class code corresponding to each set block, and comparing the class code with that of the image of the preceding frame; and
a generation step of generating the class code for the pixel of interest by encoding each pixel value of the plurality of pixels included in the block set in the setting step as 1 when the pixel value is larger than the threshold calculated in the calculation step and as 0 when the pixel value is smaller than the threshold, and of encoding in both ways, as 0 and as 1, the pixel values of as many pixels as the number of bits acquired in the acquisition step, selected in order of closeness to the threshold from among the plurality of pixels included in the block corresponding to the generated class code.

A program for causing a computer to execute processing of generating a class code indicating a spatial feature amount of an image and processing pixel values of pixels constituting the image based on the generated class code, the processing comprising:
a setting step of setting a block of a predetermined size including a pixel of interest in the input image and pixels in its vicinity;
a calculation step of calculating a threshold by adding, to the minimum value of the pixel values of the plurality of pixels included in the block set in the setting step, half of a dynamic range represented by the difference between the maximum value and the minimum value of those pixel values;
an acquisition step of acquiring the number of bits corresponding to the generated class code by referring to a correspondence table that associates each class code with a number of bits, the number of bits being one that brings the cumulative frequency of occurrence of bit inversion in the class code to a predetermined value, the frequency being obtained by setting blocks of the predetermined size for a plurality of sample images, generating the class code corresponding to each set block, and comparing the class code with that of the image of the preceding frame; and
a generation step of generating the class code for the pixel of interest by encoding each pixel value of the plurality of pixels included in the block set in the setting step as 1 when the pixel value is larger than the threshold calculated in the calculation step and as 0 when the pixel value is smaller than the threshold, and of encoding in both ways, as 0 and as 1, the pixel values of as many pixels as the number of bits acquired in the acquisition step, selected in order of closeness to the threshold from among the plurality of pixels included in the block corresponding to the generated class code.
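The correspondence table referred to in the acquisition step could be prepared along the following lines. This Python sketch rests on one possible reading of the claim and is not the procedure of the specification. The names `encode` and `build_bit_table` and the parameter `target` are hypothetical. Each sample is assumed to be supplied as a pair consisting of a block and its counterpart block from the preceding frame. The counterpart is assumed to be encoded with its own threshold. The "predetermined value" is assumed to be a fraction of all bit inversions observed for a class code.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple

Code = Tuple[int, ...]


def encode(block: Sequence[int]) -> Tuple[float, Code]:
    """Return (threshold, class code) for a flat list of pixel values."""
    lo, hi = min(block), max(block)
    th = lo + (hi - lo) / 2
    return th, tuple(1 if v > th else 0 for v in block)


def build_bit_table(
    block_pairs: Iterable[Tuple[Sequence[int], Sequence[int]]],
    target: float = 0.9,
) -> Dict[Code, int]:
    """Map each class code to the number of near-threshold bits to expand.

    hist[code][r] counts bit inversions observed at the pixel ranked r-th
    closest to the threshold (r = 0 is the closest).
    """
    hist: Dict[Code, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for cur, prev in block_pairs:
        th, code = encode(cur)
        _, prev_code = encode(prev)  # assumption: own threshold per block
        counts = hist[code]          # registers the code even if no bit flips
        order = sorted(range(len(cur)), key=lambda i: abs(cur[i] - th))
        for rank, idx in enumerate(order):
            if code[idx] != prev_code[idx]:
                counts[rank] += 1

    table: Dict[Code, int] = {}
    for code, counts in hist.items():
        total = sum(counts.values())
        table[code] = 0              # no inversion observed: expand nothing
        cumulative = 0
        for rank in sorted(counts):
            cumulative += counts[rank]
            # Smallest bit count whose cumulative share reaches the target.
            if cumulative / total >= target:
                table[code] = rank + 1
                break
    return table
```

Under these assumptions, a class code whose inversions occur almost only at the pixel nearest the threshold is assigned a bit count of 1, while a code whose inversions are spread over several near-threshold pixels is assigned a larger count.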