JP3660136B2 - Encoding / decoding device

Info

Publication number: JP3660136B2
Application number: JP22970598A
Authority: JP
Inventors: 啓行 ▲高▼橋
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1998-08-14
Filing date: 1998-08-14
Publication date: 2005-06-15
Anticipated expiration: 2018-08-14
Also published as: JP2000059228A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像データなどの圧縮／伸長のための符号化復号化装置に係り、特に、ウェーブレット変換を利用する符号化復号化装置に関する。
【０００２】
【従来の技術】
データ圧縮は、大量のデータの蓄積や伝送のために非常に有用なツールである。例えば、文書のファクシミリ伝送、ワールドワイドウェブのような画像の伝送に要する時間は、圧縮を使って画像の再生に必要とされるビット数を減らすと、飛躍的に短縮される。
【０００３】
従来より、多くの様々なデータ圧縮手法が存在している。最も広く普及している圧縮方式としてＪＰＥＧ（Joint Photographic Experts Group）がある。ＪＰＥＧにおいては、入力シンボル又は輝度データは量子化されてから出力符号語へ変換される。量子化は、データの重要な情報を保存する一方、重要でない情報を除去することを目的としている。量子化に先立ちエネルギー集中をするために変換が用いられるが、ＪＰＥＧでは、この変換としてＤＣＴ（Discrete Cosine Transform）が用いられる。ところが、ＤＣＴを用いるＪＰＥＧに対し様々な欠点が指摘されている。例えば、ブロックノイズやモスキートノイズ（蚊が飛んでいるように見えるところから、そのように呼ばれる）である。画像信号処理において、それらの欠点を解消できる、効率的かつ高精度のデータ圧縮符号化方式を追求することに関心が集中している。そのような方式の一つが、ウェーブレット（wavelet）ピラミッド処理方式である。
【０００４】
画像データのような２次元信号にウェーブレット変換を適用する場合には、入力信号を、水平方向低域通過型フィルタＨＬ（Horizontal Low）及び水平方向高域通過型フィルタＨＨ（Horizontal High）を使用して水平方向低域信号（Ｓ(smooth)係数）及び水平方向高域信号（Ｄ(detail）係数）に分離し、さらにＳ係数及びＤ係数に対して垂直方向低域通過型フィルタＶＬ（Vertical Low）及び垂直方向高域通過型フィルタＶＨ（Vertical High）をそれぞれ使用して水平方向低域−垂直方向低域信号（ＳＳ係数）、水平方向低域−垂直方向高域信号（ＳＤ係数）、水平方向高域−垂直方向低域信号（ＤＳ係数）、及び水平方向高域−垂直方向高域信号（ＤＤ係数）に分離する。以上の一連の処理をレベルと呼び、１回の水平処理と垂直処理を行った出力をレベル１の出力と呼ぶ。さらに、以上の４種類の信号を周波数帯信号と呼ぶ。レベル２以上の出力を希望するときは、この処理がＳＳ係数に対して再帰的に行われる。レベル２の出力では、ＳＳ係数と、１ＳＤ係数及び２ＳＤ係数、１ＤＳ係数及び２ＤＳ係数、１ＤＤ係数及び２ＤＤ係数、の７つの周波数帯信号が得られる。以上の説明では、まず水平方向にフィルタを適用し、次に垂直方向にフィルタを適用したが、その順序は逆でもよい。
【０００５】
以上の過程を経て得られた各周波数帯信号が符号化部で符号化される。符号化は、周波数帯信号毎にビット単位で行われる。ある周波数帯信号の、一番最初の画素の最上位ビット（ＭＳＢ）が処理の対象となる。この画素自身の状態と、周辺の画素の状態及び１つ上のレベルの状態が参照され、出力が決定される。次に２番目の画素のＭＳＢが処理の対象となるのであるが、この際は一番最初に処理された画素の状態も参照される。以下、符号化されるべき領域に対して一連の処理が終了すると、一番最初の画素のＭＳＢの次位のビット（ＭＳＢ−１）が処理の対象となる。この際は、同じビット深さの周辺画素の状態に加えＭＳＢの状態も参照される。このようにして、符号化されるべき状態に対し最下位ビット（ＬＳＢ）まで符号化が行われる。符号化データに対する復号化も、ほぼ同じ手順を経て行われる。
【０００６】
図１４に、レベル４までの処理を行う場合の従来の構成を示した。図中、１０００はウェーブレット変換部、１００１はインターフェース、１００２はフレームメモリ、１００３は符号化復号化部である。
【０００７】
ウェーブレット変換部１０００において、filter１Ｈ，filter２Ｈ，filter３Ｈ，filter４Ｈは、水平方向低域通過型フィルタＨＬ及び水平方向高域通過型フィルタＨＨを含む水平方向フィルタである。これらのフィルタ名中の数字１〜４はレベルを表し、Ｈは水平方向フィルタであることを意味する。同様に、filter１Ｖ１とfilter１Ｖ２、filter２Ｖ１とfilter２Ｖ２、filter３Ｖ１とfilter３Ｖ２、filter４Ｖ１とfilter４Ｖ２は、垂直方向低域通過型フィルタＶＬ及び垂直方向高域通過型フィルタＶＨを含む垂直方向フィルタである。これらのフィルタ名中のＶは垂直方向フィルタであることを意味し、Ｖの前の数字１〜４はレベルを表し、Ｖの後の数字１は水平方向低域信号（Ｓ係数）を入力とするフィルタであることを示し、Ｖの後の数字２は水平方向高域信号（Ｄ係数）を入力とするフィルタであることを示す。以上のフィルタはどのような構成のものでもよいが、以下の説明では、水平方向低域通過型フィルタＨＬ及び垂直方向低域通過型フィルタＶＬとして、２組のデータを用いて演算を行う２タップのフィルタを使用するものとする。また、水平方向高域通過型フィルタＨＨ及び垂直方向高域通過型フィルタＶＨとして、低域通過形フィルタＨＬまたはＶＬの出力であるＳ係数又はＤ係数のうち、現在の位置と、１つ前及び１つ後の合計３組のデータを用いて演算を行う６タップのフィルタを使用するものとする。
【０００８】
符号化復号化部１００３は、図１５に示すように、処理部１０１０、記憶部１０１１、制御部１０１２から構成される。記憶部１０１１は、対象となる周波数帯信号を保持する記憶要素Ｂ１０１４、対象となる周波数帯信号に対する、１つ上のレベルの周波数帯信号を保持する記憶要素Ａ１０１３、及び、これら記憶要素に対するアドレス生成部１０１５から構成される。記憶要素Ａ１０１３と記憶要素Ｂ１０１４は、ｎビットの深さを持ち、高速処理が必要であればビット単位での読み出し又は書き込みが可能なような構成となっており、高速処理が必要でなければ、ワード単位で読み出しを行い、対象となるビットを書き込み、再びワード単位で書き込みが行われるような構成となっている。
【０００９】
前述のフィルタを用いた場合のウェーブレット変換部１０００での処理の例を図１６に示す。但し、この図におけるデータのマッピングは演算の方法を説明するためのものであり、実際のメモリへのマッピングは例えば図１８から図２１に示すようになることに注意されたい。
【００１０】
図１６の（ａ）は水平方向フィルタの処理を説明するもので、［００］は０ライン目の０画素目のデータを意味し、［１２］は１ライン目の２画素目のデータを意味する（このようにライン、画素とも０番目から数えるものとする）。水平方向低域通過型フィルタＨＬの０画素目の出力［Ｓ００］は、データ［００］及びデータ［０１］から求められ、また、１画素目の出力［Ｓ０１］はデータ［０２］及びデータ［０３］から求められる。これに対し、水平方向高域通過型フィルタＨＨの０画素目の出力［Ｄ００］は、データ［００］の２つ前及び１つ前のデータ（実在しない）と、データ［００］と、データ［０１］と、データ［０２］と、データ［０３］とから求められる。ここで、実在しないデータ［００］の２つ前と１つ前のデータを得るため、ミラーと呼ばれる処理を施す。具体的には、データを鏡像関係で折り返す処理を行う。これにより、２つ前と１つ前のデータはデータ［０１］とデータ［００］となる。このようにして、［Ｄ００］は６画素のデータから計算される。
【００１１】
図１６の（ｂ）は垂直方向フィルタの処理を説明している。この処理は、水平方向フィルタ処理によるＳ係数及びＤ係数を用いて垂直方向に行われる。実在しない係数は、水平方向フィルタの処理の場合と同様にミラー処理が施される。
【００１２】
図１７はフレームメモリ１００２にラスタ順に格納されたイメージデータを示す。図１８乃至図２１に、ウェーブレット処理のレベル２までの演算結果の格納方法を例示する。ウェーブレット変換部１０００は、最初に、図１７に示すように格納されたイメージデータをフレームメモリ１００２から読み出して水平処理を行い、その結果を再びフレームメモリ１００２に書き込む。この書き込みの際に、未処理のデータに上書きしてしまわないように、図１８に示すようなマッピングでＳ係数及びＤ係数を書き込んでいく。図１８において、［１Ｓ００］はレベル１のアドレス００のＳ係数を意味する。図１９垂直処理を行った後の各係数を書き込む際のマッピングの例を示す。ここまでがレベル１の各係数の格納方法である。図２０はレベル２の水平方向の各係数の格納方法の例を示す。レベル２の処理は１ＳＳ係数に対してのみ行われるため、網掛けされた部分のデータは用いられないことに注意されたい。ついで、図２１に示すようなマッピングで、レベル２の各係数が格納され、レベル２の処理が終了する。以上の処理がレベル４まで繰り返される。
【００１３】
図２２は、図１４に示す構成でのタイミングチャートである。ただし、このタイミングチャートは処理手順の説明のために用いるものであり、横軸（時間軸）のスケールはリニアでないことに注意されたい。また、以下の説明では、画素数もしくはライン数を０画素目もしくは０ライン目、というように０から数える。入力されるイメージデータ（ラスタデータ）は３２画素×３２ライン（０画素目から３１画素まで、０ライン目から３１ライン目）であり、１つのデータの区切り（×＝×）が１ラインに相当するものとする。
【００１４】
時刻ｔ０から、０ライン目のデータが０画素目から順次入力され、１画素目が入力されるとfilter１Ｈより０画素目のデータ［１Ｓ００］が出力される。ついでデータ［１Ｓ０１］が出力されると、Ｄ係数の計算に必要となる３組のＳ係数（［１Ｓ００］，［１Ｓ００］，［１Ｓ０１］）が揃い（１つ前のデータはミラー処理により得られる）、Ｄ係数［１Ｄ００］が出力される。これが１ライン分、繰り返される。なお、タイミングチャート上では１ラインの時間単位で示されているが、拡大すれば画素単位でのずれが生じていることに注意されたい。
【００１５】
時刻ｔ１から１ライン目のデータの入力が始まり、filter１Ｈより［１Ｓ１０］、［１Ｄ１０］、とＳ係数及びＤ係数が順次出力される。［１Ｓ１０］が出力された時点で垂直方向フィルタのfilter１Ｖ１より［１ＳＳ００］が、filter１Ｖ２より［１ＤＳ００］が出力される。次に、［１Ｓ１１］が出力された時点でfilter１Ｖ１及びfilter１Ｖ２においてＤ係数の計算に必要な３組のデータが揃う。すなわち、filter１Ｖ１においては［１Ｓ１０］，［１Ｓ１０］，［１Ｓ１１］、filter１Ｖ２においては［１Ｄ１０］，［１Ｄ１０］，［１Ｄ１１］が揃い（１つ前のデータはミラー処理により得られる）、レベル１の出力データ［１ＳＳ００］，［１ＳＤ００］，［１ＤＳ００］，［１ＤＤ００］が得られる。これが１ライン分繰り返される。
【００１６】
時刻ｔ２で、２Ｖの１ライン目の入力が開始されて２Ｖの処理が始まる。以下、同様のタイミング関係で時刻ｔ９まで処理が繰り返され、レベル４までの各周波数帯信号が出力される。これら各データはフレームメモリ１００２に書き込まれる。
【００１７】
フレームメモリ１００２に書き込まれた各周波数帯信号は、符号化復号化部１００３によって符号化される。符号化復号化部１００３では、画像信号の隣接画素の相関、特に同一ビットプレーン内での相関が高いという特性を活かして圧縮率を上げている。このため、符号化の際には、あるまとまった領域のデータをビット単位（ある画素のデータの、任意の１ビット）のデータを扱う必要がある。画像データのサイズは通常、非常に大きい（数ＭＢに及ぶ場合がある）ので、ウェーブレット処理終了後のデータは一旦フレームメモリ１００２に書き込む必要がある。ところが、フレームメモリ１００２ではビット単位での読み出し又は書き込みは不可能である。そこで、符号化復号化部１００３は、内部に用意されたビット単位での処理が可能な記憶要素１０１３，１０１４にフレームメモリ１００２からデータをロードして符号化処理を行い、符号化データｃｏｄｅを出力する。復号化は以上に述べた動作のほぼ逆順の動作で行われる。
【００１８】
次に、ウェーブレット変換のための処理時間について説明する。ここでは、ウェーブレット変換部１０００により生成される各周波数帯信号のストレージ、すなわち図１４中のフレームメモリ１００２として、一般的な半導体メモリが用いられるものとする。
【００１９】
図２２のタイミングチャートを用いて説明したように、各周波数帯信号は同時刻にパラレルに出力されるため、外部メモリへの書き込みもパラレルに行われなければならないが、通常用いられる半導体メモリでは１時刻に読み出しまたは書き込みをすることができるのは１データだけである。図２２の左下のｒａｎｇｅは、時刻ｔ０からｔ９に対応する、１Ｈ，１Ｖ，．．．，４Ｖの各処理が占める処理時間の範囲を←→で示したものである。ｒａｎｇｅの下のｒ／ｗ cyclesは、ｒａｎｇｅ内（←→）の範囲での書き込み／読み出しの合計数であるが、異なるレベルが同時に処理されている範囲での回数は、それら各レベルに関する回数の合計で示されている。図２２の右側に示した数値は、各レベルの水平処理もしくは垂直処理に要するメモリアクセスの回数（書き込みと読み出しの合計数）である。この数値はウェーブレット逆変換時も同じである。さて、メモリアクセスの回数についてであるが、各レベルにおいて、水平処理、垂直処理のいずれも必ず全データが１回読み出され、全データがフィルタ出力データで書き換えられるから、全画素数の２倍の読み出し／書き込み回数が必要となる。
【００２０】
以上に述べたような、本発明に関連する符号化復号化装置と、ウェーブレット変換装置あるいはウェーブレット変換フィルタに関する、より詳細な情報は、特開平８−１３９９３５号公報などを参照されたい。また、符号化部については、特開平９−１２１１６８号公報などを参照されたい。さらに、類似のウェーブレット変換に関する従来技術については、特開平３−２７６８７号公報、特開平５−１６７９９７号公報、あるいは特開平５−１８３３８６号公報などを参照されたい。
【００２１】
【発明が解決しようとする課題】
前述のように、ウェーブレット変換の出力はストレージに一旦貯える必要があり、データを単に入力するのに要する時間に比べ数倍の処理時間がかかるという問題があった。前述の従来技術の場合、入力されるイメージデータのサイズを３２画素×３２ライン、レベル数を４とした場合、イメージデータの入力に必要なサイクル数が１０２４＝３２×３２であるのに対し、必要な処理時間は５倍以上の５４４０サイクルとなる。入力データのサイズが増加すれば、処理時間はさらに大幅に増大することは明かである。例えば、６４画素×６４ラインの場合は、図２２に点線で示すように、１Ｈの処理が時刻ｔ１０まで行われる結果、パラレルに出力される区間が増加するため、処理時間は大幅に増大する。レベル数が増えた場合も同様に処理時間が大幅に増大する。
【００２２】
また、各レベルの各周波数帯信号が同じ時刻に出力されるので、パイプライン処理が必要であった。すなわち、フィルタ毎にデータが入力されるタイミングが異なっているため、各フィルタに、それが使用される場所に応じ個別的に設計したコントローラを内蔵させる必要があった。また、これらのコントローラはただ一つの条件の画素数とレベルの組合せにしか対応させることができず、画素数またはレベルの一方又は両方が変更された場合に対応が困難であるという問題があった。
【００２３】
また、ウェーブレット変換部にその処理のための記憶部を、符号化復号化部にその処理のための記憶部を、それぞれ別々に備えているため、符号化復号化装置をワンチップ化する場合、記憶部の占めるチップ面積が大きくなってしまうという問題があった。
【００２４】
本発明は、前述の問題点に鑑みなされたものであり、その主たる目的は、パイプライン処理を必要とせずに高速処理の可能な符号化復号化装置を提供することにある。本発明のもう一つの目的は、記憶部の占めるチップ面積が少なくてすむ高速処理が可能な符号化復号化装置を提供することにある。
【００２６】
【課題を解決するための手段】
請求項１の符号化復号化装置は、ウェーブレット変換による周波数帯信号の符号化又はその符号化データの復号化のための符号化復号化部と、それぞれが独立した記憶要素からなる記憶部と、該記憶部への書き込みデータの一部を一時的に保存するためのラインメモリと、該記憶部に対する行方向へのアクセスを制御するための行方向アドレスデコード／データ選択部と、該記憶部に対する列方向へのアクセスを制御するための列方向アドレスデコード／データ選択部と、該行方向アドレスデコード／データ選択部を介して入力されるデータに対しウェーブレット変換又は逆ウェーブレット変換の水平処理を施すための行方向フィルタ部と、該列方向アドレスデコード／データ選択部を介し入力されるデータに対しウェーブレット変換又は逆ウェーブレット変換の垂直処理を施すための列方向フィルタ部と、該行方向アドレスデコード／データ選択部及び該列方向アドレスデコード／データ選択部を介して該記憶部に対するデータの書き込み及び読み出しと該行方向フィルタ部及び該列方向フィルタ部に対するデータの入出力を制御し、該符号化復号化部に対するデータの入出力、該ラインメモリに対するデータの書き込み及び読み出し、並びに外部メモリに対するデータの書き込み及び読み出しとを制御する主制御部とを具備し、該記憶部が、ウェーブレット変換処理又は逆ウェーブレット変換処理のためのバッファ領域及び符号化処理又は復号化処理のためのバッファ領域として共用される構成とされる。
【００２７】
請求項２の符号化復号化装置の特徴は、請求項１の符号化復号化装置の構成において、該外部メモリからのデータ入力と並行してウェーブレット変換の水平処理が実行されることである。
【００２８】
請求項３の符号化復号化装置の特徴は、請求項１又は２の符号化復号化装置の構成において、該ラインメモリに書き込まれるデータが処理の中間データであることである。
【００２９】
請求項４の符号化復号化装置の特徴は、請求項１又は２の符号化復号化装置の構成において、該記憶部のオーバーラップ領域に処理の中間データが書き込まれることである。
【００３０】
請求項５の符号化復号化装置の特徴は、請求項１乃至４の各項の符号化復号化装置の構成において、該記憶部が、要求される最大レベルまでの全レベルのウェーブレット変換の処理を内部で連続して実行するために必要な記憶容量を持つことである。
【００３１】
【発明の実施の形態】
図１は、本発明の第１の実施例を示すブロック図である。本実施例の符号化復号化装置は、主制御部１００、ウエーブレット変換又は逆ウェーブレット変換のためのウェーブレット変換部１０１、ウェーブレット変換による周波数帯信号の符号化又はその符号化データの復号化のための符号化復号化部１０９、記憶部１０２、ラインメモリ１０４、行方向制御部１０６及び列方向制御部１０８からなる。１１０は本符号化復号化装置と接続される外部のフレームメモリである。行方向制御部１０６は記憶部１０２に対する行方向（ｘ方向）へのアクセスを制御し、列方向制御部１０８は記憶部１０２に対する列方向（ｙ方向）へのアクセスを制御するものである。主制御部１００は、行方向制御部１０６及び列方向制御部１０８を介し記憶部１０２に対するデータの書き込み及び読み出しを制御し、ラインメモリ１０４に対するデータの書き込み及び読み出しを制御し、フレームメモリ１１０に対するデータの書き込み及び読み出しを制御し、また、ウェーブレット変換部１０１及び符号化復号化部１０９に対するデータの入力及び出力を制御する。
【００３２】
記憶部１０２は、ウェーブレット変換部１０１及び符号化復号化部１０９の処理のためのバッファ領域として共用されるものである。したがって、ウェーブレット変換部１０１及び符号化復号化部１０９の内部には、従来のような記憶部を設ける必要はない。本実施例においては、記憶部１０２は、行方向（ｘ方向）及び列方向（ｙ方向）へのデータシフトが可能なシフトレジスタからなるものである。ここでは、記憶部１０２の大きさは２０行×２０列（２０画素×２０ライン）であるとする。図２に、記憶部１０２の記憶セルの構成を示す。各記憶セルはｎビットのビット深さを有する。
【００３３】
以下、図１、図２、図３乃至図８を参照し、本実施例について詳細に説明する。なお、図３は記憶部１０２に対するデータの書き込み方の一例を示し、図４は記憶部１０２における水平処理後のデータの書き込み方の一例を示す。図５はフレームメモリ１１０に対する記憶部１０２及びラインメモリ１０４の割り当て方（タイリング）の一例を示し、図６は記憶部１０２及びラインメモリ１０４に関係したデータの移動や複写などを説明するための図である。図７は、レベル２まで終了した時点の記憶部１０２におけるマッピングの一例を示す図である。図８は、１ブロックに対するレベル２までの変換処理動作のタイミングチャートである。なお、各図の内容は例示であって、本発明の主旨から逸脱しない限り、さまざまな形態をとり得ることに注意されたい。
【００３４】
図２において、点線で囲んだ各部分が記憶部１０２の１つの記憶セルを表す。図示のように、記憶部１０２の各記憶セルは、ｎビットのビット深さを有するシフトレジスタＳＲとマルチプレクサＭＵＸからなる。各記憶セルのシフトレジスタＳＲのデータ入力には、マルチプレクサＭＵＸを介して、行方向の前段（右側）の記憶セルの出力データ又は列方向の前段（下側）の記憶セルの出力データが入力される。この入力データの切り替え、すなわち行方向へのデータシフトか列方向のデータシフトかの切り替えは、マルチプレクサＭＵＸへの制御入力ｈｖｂ（horizontal/vertical bar）によって制御される。ここまでの説明から明らかなように、行方向（ｘ方向）のデータシフトは右から左への向きに、列方向（ｙ方向）のデータシフトは下から上への向きに、それぞれ行われることになる。
【００３５】
ウェーブレット変換を行う場合、まず、フレームメモリ１１０に記憶されているイメージデータの０ライン目のデータが主制御部１００の制御によって読み出され、これが行方向制御部１０６を介して記憶部１０２の一番下の第ｊ行（図４参照）の最前段（右端）に入力され左方向へ順次シフトされる。ただし、０画素目と１画素目に対してはミラー処理が必要となるため、記憶部１０２にも最初からその順で書き込まれる。図３中の網掛け部分は、ミラー処理された画素のデータを示す。このようにして図３（ｂ）に示すように０ライン目のデータが全て書き込まれると、このデータは一つ上の行へシフトされる。次に、１ライン目のデータが同様にフレームメモリ１１０から読み出され、行方向制御部１０６により記憶部１０２の一番下の行に入力され順次左へシフトされる。また、データ入力と並行してウェーブレット変換部１０１により水平処理が実行され、０ライン目のデータに対して計算されたＳ係数、Ｄ係数が、行方向制御部１０６を介し、記憶部１０２の０ライン目データが書き込まれている行に入力され順次左へシフトされる。図３（ｃ）は、このような０ライン目の水平処理と１ライン目のデータの入力処理の途中の状態を示す。同様のイメージデータの入力と水平処理が並行して繰り返し実行されることにより、最終的に、水平処理の結果が記憶部１０２に図４に示すようにマッピングされる。なお、ラインメモリ１０４に対するデータの書き込み又は読み出しも行われるが、これについては後述する。
【００３６】
図４において、記憶部１０２の第０行と第１行には、第３行と第２行と同じデータが書き込まれていることに注意されたい。これは垂直方向のミラー処理であり、具体的には、第ｊ行まで処理が終わった段階で、行方向制御部１０６の制御により第３行のデータが第０行に書き込まれ、また第２行のデータが第１行に書き込まれる。また、各行の第ｉ列と第ｊ列には第ｇ列と第ｈ列と同じデータが書き込まれていることに注意されたい。これは、水平処理の過程で行方向制御部１０６の制御によって行われるが、その理由については後述する。
【００３７】
さて、図４のようにマッピングされた記憶部１０２上のＳ係数、Ｄ係数に対して、垂直処理が施される。ただし、第０列と第１列は処理の対象外であることに注意されたい。垂直処理の場合、列方向制御部１０８の制御により、第２列のデータが列方向（上向き）へ順次シフトされ、シフトアウトされたデータが主制御部１００を経由してウェーブレット変換部１０１へ入力されてＳＳ係数、ＳＤ係数が計算され、それら係数が列方向制御部１０８の制御により記憶部１０２の第２列に順次入力されシフトされていく。第３列のデータも同様に列方向にシフトされ、シフトアウトされたデータがウェーブレット変換部１０１に入力されてＤＳ係数、ＤＤ係数が計算され、これが第３列に入力されシフトされる。同様の処理が第ｈ列まで繰り返されることにより、レベル１のウェーブレット変換が終わる。
【００３８】
レベル２のウェーブレット変換は、記憶部１０２上のレベル１のＳＳ係数（１ＳＳ）のみを対象として行われる。すなわち、行方向制御部１０６の制御により各行のデータが行方向にシフトされ、シフトアウトされた１ＳＳ係数がウェーブレット変換部１０１へ送られて２Ｓ係数，２Ｄ係数が計算され、この係数は、シフトアウトされたレベル１のデータ（１ＳＤ係数、１ＤＳ係数、１ＤＤ係数）とともに、図２０に示したようなマッピングとなるような順番で当該行に入力され順次シフトされる。この繰り返しによりレベル２の水平処理が終わる。なお、第０行と第１行は処理の対象外である。また、各行の第ｉ列と第ｊ列のデータも処理の対象外である。次にレベル２の垂直処理が行われる。第２列から第ｇ列までの各列について、列方向制御部１０８の制御によりデータが列方向に順次シフトされ、シフトアウトされた２Ｓ係数又は２Ｄ係数がウェーブレット変換部１０１に入力されて２ＳＳ係数と２ＳＤ係数、又は、２ＤＳ係数と２ＤＤ係数が計算され、これら係数はシフトアウトされたデータとともに、図２１に示したようなマッピングとなるような順番で当該列に入力され順次シフトされる。この処理が繰り返され、レベル２の処理が終わる。かくして、レベル２までのウェーブレット変換の結果は記憶部１０２上に図７に示すようにマッピングされる。この１６画素×１６ラインのデータはフレームメモリ１１０へ書き出される。レベル３以上の変換を行う場合には、フレームメモリ１１０上のＳＳ３係数だけが記憶部１０２に読み込まれ、同様の処理が行われる。
【００３９】
図５は、フレームメモリ１１０のサイズが記憶部１０２のサイズより大きい場合のタイリング方法を示している。Ｂ００はブロック（水平０、垂直０）を意味する。ここでは、記憶部１０２のサイズは２０画素×２０ラインとしているので、図５中のｘ0は１５画素目、２ｘ0は３１画素目であり（０から数えている）、ｙ0は１５ライン目、２ｙ0は３１ライン目である（ｘ0＝ｙ0＝１５、０から数えている）。したがって、３２画素×３２ライン（０画素目から３１画素目まで、０ライン目から３１ライン目まで）のイメージを処理する場合、図示のようにＢ００，Ｂ０１，Ｂ１０，Ｂ１１の４つのブロックに分割して処理する必要がある（この図は従来技術との比較を行うための図である）。次に、図６を参照して、各ブロックの処理について説明する。
＜ブロックＢ００の処理：図６（ａ）＞
このブロックＢ００については、０画素目から１７画素目まで、０ライン目から１７ライン目までの１８画素×１８ラインのイメージデータが記憶部１０２に読み込まれる。図６（ａ）に示す▲１▼の部分はデータが実在しないので、そのデータはミラー処理によって補われる。下隣りのブロックＢ１０の処理で必要であるがブロックＢ００の変換データによって上書きされてしまう▲２▼の部分（図４の第ｇ行、第ｈ行に対応）のデータは、予めラインメモリ１０４にコピーされる。また、ブロックＢ００の変換データによって上書きされてしまう▲３▼の部分（図４の第ｇ列、第ｈ列に対応）のデータは、下隣りのブロックＢ０１の処理の際に使用できるようにするため記憶部１０２の右端（図４の第ｉ列と第ｊ列）に予めコピーされる。ブロックＢ００に対する変換が行われ、その変換データ（図７に示す１６画素×１６ラインのデータ）がフレームメモリ１１０の０画素目から１５画素目まで、０ライン目から１５ライン目までの領域に書き込まれる。次にブロックＢ０１が処理される。
＜ブロックＢ０１の処理：図６（ｂ）＞
このブロックＢ０１については、１６画素目から３３画素目まで、０ライン目から１７ライン目までの１８画素×１８ラインのイメージデータが記憶部１０２に読み込まれる。この際、記憶部１０２の右端にあった▲３▼の部分のデータは左端（図４の第０列、第１列）に移動させられる。データが存在しない▲１▼の部分のデータはミラー処理により補われる。下隣りのブロックＢ１１の処理で必要であるがブロックＢ０１の変換データで上書きされてしまう▲４▼の部分（図４の第ｇ行、第ｈ行に対応）のデータは、ラインメモリ１０４に予めコピーされる。ブロックＢ０１に対する変換データは、フレームメモリ１１０の１６画素目から３１画素目まで、０ライン目から１５ライン目までの領域に書き込まれる。ブロックＢ１０の処理に進む。
＜ブロックＢ１０：図６（ｃ）＞
このブロックＢ１０については、０画素目から１７画素目まで、１６ライン目から３３ライン目までのイメージデータが記憶部１０２に読み込まれる。データが存在しない▲１▼の部分のデータはミラー処理によって補われる。▲２▼の部分は上隣りのブロックＢ００の変換データで上書きされているので、ラインメモリ１０４にセーブされていた、その部分のデータが書き込まれる。▲６▼の部分のデータは右隣りのブロックＢ１１の処理で使用できるようにするため記憶部１０２の右端にコピーされる。ブロックＢ１０の変換データで上書きされる▲５▼の部分のデータは、ラインメモリ１０４にコピーされる。このブロックＢ１０の変換データはフレームメモリ１１０の０画素目から１５画素目まで、１６ライン目から３１ライン目までの領域に書き込まれる。次にブロックＢ１１の処理に進む。
＜ブロックＢ１１の処理：図６（ｄ）＞
フレームメモリ１１０の１６画素目から３３画素目まで、１６ライン目から３３ライン目までのイメージデータが記憶部１０２に書き込まれる。この際、記憶部１０２の右端にあった▲６▼の部分のデータは左端に移動させられる。▲４▼の部分にはラインメモリ１０４にセーブされていたデータが書き込まれる。▲７▼の部分のデータはラインメモリ１０４にコピーされる。このブロックＢ１１に対する変換データは、フレームメモリ１１０の１６画素目から３１画素目まで、１６ライン目から３１ライン目までの領域に書き込まれる。
【００４０】
図８は、上に述べた各ブロックに対し、レベル２までのウェーブレット変換を行う場合のタイミングチャートである。時刻ｔ０から時刻ｔ１までがフレームメモリ１１０から記憶部１０２へのデータの読み込みとレベル１の水平処理が行われる期間であり、時刻ｔ１から時刻ｔ４までが内部でレベル１の垂直処理からレベル２の垂直処理までが行われる期間であり、時刻ｔ４から時刻ｔ５までが変換データをフレームメモリ１１０に書き出す期間である。フレームメモリ１１０に対するデータの読み出しと書き込みのサイクル数は、それぞれ４００サイクル（＝２０×２０）と２５６サイクル（＝１６×１６）である。フレームメモリ１１０のアクセスを伴わない内部動作はフレームメモリ・アクセスに比べ遥かに高速化できるが、ここでは内部動作をフレームメモリ・アクセスと同じ動作速度であると仮定して時刻ｔ１〜時刻ｔ４までの期間のサイクル数を計算すると８００サイクルとなる。したがって、時刻ｔ０から時刻ｔ５までの総サイクル数は１４５６サイクルとなる。
【００４１】
同じ処理がＢ００，Ｂ０１，Ｂ１０，Ｂ１１の４ブロックに対して繰り返され、レベル２の変換データが得られる。レベル４の変換データを得る場合には、Ｂ００〜Ｂ１１の４ブロックに含まれるＳＳ係数（２ＳＳ）を集め１つのブロックとして処理される。この処理では９画素、９ライン（０から数えて）となるので、サイクル数はブロックＢ００の処理の４分の１、すなわち３６４サイクルとなる。したがって、レベル４までのウェーブレット変換処理にかかる総時間は
（４００＋２５６＋８００）＊（４＋１／４）＝６１８８
となる。しかし、これは内部動作の速度をフレームメモリ・アクセスと同じと仮定した数値であって、実際には内部動作をフレームメモリ・アクセスに比べ数倍高速化することは容易であるから、総時間はさらに短縮可能である。例えば、内部動作の速度をフレームメモリ・アクセスの２倍と仮定すると、総時間は
（４００＋２５６＋８００／２）＊（４＋１／４）＝４４８８
サイクルまで減少する。内部動作の速度をフレームメモリ・アクセスの４倍と仮定すると、総時間は
（４００＋２５６＋８００／４）＊（４＋１／４）＝３６３８
サイクルまで減少する。このように、本発明によれば、従来技術において同様の処理を行う場合の総時間５４４０サイクルより、総時間を短縮できることは明かである。
【００４２】
以上のようにして１フレームの全体（又は一部）のウェーブレット変換が終了すると、フレームメモリ１１０上の周波数帯信号データが主制御部１００の制御により読み出され、行方向制御部１０６及び列方向制御部１０８を介して記憶部１０２に書き込まれる。そして、符号化復号化部１０９は、記憶部１０２上の周波数帯信号データに対する符号化処理を行い、符号化データｃｏｄｅを出力する。すなわち、この時には記憶部１０２は符号化処理のためのバッファ領域として利用され、符号化復号化部１０９と記憶部１０２との間のデータ入出力は主制御部１００と行，列方向制御部１０６，１０８によって制御される。
【００４３】
符号化について、より詳しく説明すれば、ＳＳ係数を除いた各レベルの各種類の周波数帯信号毎に、例えば、４ＤＳ，４ＳＤ，４ＤＤ，．．．毎に、ビットプレーン（同じビット深さの一の２次元のビット平面）単位で、そのＭＳＢ（最上位ビット）から下位ビットへと順に処理される。符号化の処理は、ビットプレーンの２（ｘ方向）×８（ｙ方向）画素の単位（これは２×８の大きさのデータが存在する場合。それより小さい場合は、その大きさの単位）で行われる。実際に処理されるのは上述の大きさの単位毎であるが、その周辺も参照するため、周辺を含めた領域のデータ、例えば４×１０画素のデータがフレームメモリ１１０から記憶部１０２に読み込まれる。さらに、同じ種類の１つ上のレベルの周波数帯信号が存在する場合は、それも参照されるので、同様にフレームメモリ１１０から読み込まれる。最上位のビットプレーンの処理が終了すると、１つ下位のビットプレーンが同様に処理される。これを繰り返すことにより、１つのレベルの１種類の周波数帯信号の処理が終了する。これが全レベルの全種類の周波数帯信号に対して行われ、符号化を終了する。
【００４４】
以上は符号化の場合の説明であったが、復号化は符号化と逆の手順によって行われる。すなわち、外部より入力する符号化データｃｏｄｅは符号化復号化部１０９によって復号化され、周波数帯信号データが記憶部１０２上に復元される。この時には、記憶部１０２が復号化処理のためのバッファ領域として使用されるわけである。復元された周波数帯信号データはフレームメモリ１１０に書き出される。
【００４５】
復号化をより詳しく説明すれば、符号化データｃｏｄｅから、あるレベルのある種類の周波数帯信号、例えば４ＤＤ係数が、ＭＳＢからビット単位で復号化され、ビットプレーンが再生される。復号化もビットプレーンの２（ｘ方向）×８（ｙ方向）画素の単位で（２×８の大きさのデータが存在する場合。それより小さい場合は、その大きさの単位で）行われる。同じ種類の１つ上のレベルの周波数帯信号が存在する場合は、それも参照される。当該ビットプレーンの処理が終了すると、１つ下位のビット深さのビットプレーンが処理される。同様の処理が全レベルの全種類の周波数帯信号に対して行われる。
【００４６】
このようにして、１フレーム全体（又は一部）の周波数帯信号データがフレームメモリ１１０上に復元されると、周波数帯信号データが１タイル分、記憶部１０２へ読み込まれ、ウェーブレット変換部１０１を利用して逆ウェーブレット変換処理が行われる。逆ウェーブレット変換処理はレベル４から行われ、また、各レベルの垂直処理、水平処理がこの順で行われる。最初にＳＳ係数と４ＳＤ，４ＤＳ，４ＤＤの各係数から３ＳＳ係数が再生され、これがＳＳ係数と４ＳＤ，４ＤＳ，４ＤＤの各係数に上書きされる。再生された３ＳＳ係数と３ＳＤ，３ＤＳ，３ＤＤの各係数から２ＳＳ係数が再生され、これが３ＳＳ，３ＳＤ，３ＤＳ，３ＤＤの各係数に上書きされる。同様にウェーブレット変換とは逆の手順が繰り返され、最終的にイメージデータが復元され、これがフレームメモリ１１０に書き出され、当該タイルの周波数帯信号データに上書きされる。
【００４７】
なお、外部のフレームメモリ１１０を除き、符号化復号化装置をワンチップ化する場合、記憶要素の占めるチップ面積の増加を抑えることが重要である。本実施例によれば、記憶部１０２をウェーブレット変換部１０１と符号化復号化部１０９のバッファ領域として共用するため、それぞれに別々の記憶部を用意する構成に比べ、記憶要素の占めるチップ面積を減らし、かつ、前述のように高速処理が可能となる。
【００４８】
図９は、本発明の第２の実施例を示すブロック図である。本実施例の符号化復号化装置は、主制御部２００、記憶部２０２、ラインメモリ２０４、ウェーブレット変換又は逆ウェーブレット変換の水平処理のための行方向フィルタ部２０６、垂直処理のための列方向フィルタ部２０７、行方向アドレスデコード／データ選択部２０８、列方向アドレスデコード／データ選択部２０９、及び、符号化復号化部２１１からなる。２１０は本符号化復号化装置に接続される外部のフレームメモリである。行方向アドレスデコード／データ選択部２０８は記憶部２０２に対する行方向へのアクセスを制御するものであり、列方向アドレスデコード／データ選択部２０９は記憶部２０２に対する列方向へのアクセスを制御するためのものである。主制御部２００は、行方向アドレスデコード／データ選択部２０８及び列方向アドレスデコード／データ選択部２０９を介し記憶部２０２をアクセスし、記憶部２０２に対するデータの書き込み及び読み出しとフィルタ部２０６，２０７に対するデータの入出力を制御し、また、ラインメモリ２０４に対するデータの書き込み及び読み出し、符号化復号化部２１１に対するデータの入出力を制御し、さらに、フレームメモリ２１０に対するデータの書き込み及び読み出しを制御する。ラインメモリ２０４は、前記第１実施例におけるラインメモリ１０４と同じ目的に利用されるものである。図１０に、記憶部２０２中の１つの記憶セルを示す。各記憶セルはｎビットのビット深さを持つシフトレジスタＳＲとマルチプレクサＭＵＸから構成され、それぞれが互いに独立している。
【００４９】
本実施例の符号化復号化装置においては、前述の如く、ウェーブレット変換又は逆ウェーブレット変換のためのフィルタ部が、水平処理のための行方向フィルタ部２０６と垂直処理のための列方向フィルタ部２０７とに分離されている。水平処理時にはフィルタ演算に必要なデータが行方向アドレスデコード／データ選択部２０８を介して行方向フィルタ部２０６に入力され、その演算結果が行方向アドレスデコード／データ選択部２０８を介して記憶部２０２に書き込まれる。垂直処理時には、フィルタ演算に必要なデータが列方向アドレスデコード／データ選択部２０９を介して列方向フィルタ部２０７に入力され、その演算結果が列方向アドレスデコード／データ選択部２０９を介して記憶部２０２に書き込まれる。このようなフィルタ部に関連した構成を除けば、本実施例の符号化復号化装置の全体的な動作は基本的に前記第１実施例の場合と同様であり、ラインメモリ２０４の利用方法も同様である。
【００５０】
しかし、記憶部２０２は、その各記憶セルが図１０に示すように完全に独立しており、主制御部２００から発行されるアドレスを行方向アドレスデコード／データ選択部２０８及び列方向アドレスデコード／データ選択部２０９でデコードすることにより、任意の記憶セルに対し直接的に書き込み／読み込みを行うことができる。したがって、フレームメモリ２１０から記憶部２０２へのデータ転送の終了直後、数サイクルで全ての行方向の処理（水平処理）を終了させることができ、また、水平処理の終了直後、数サイクルで列方向の処理（垂直処理）を終了させることができる。すなわち、前記第１実施例に比べ、内部処理の動作をはるかに高速化することができる。
【００５１】
図１１は、前記第１実施例の場合と同じタイリングにおける、本実施例の符号化復号化装置のウェーブレット変換処理のタイミングチャートである。図１１において、時刻ｔ０から時刻ｔ１までがフレームメモリ２１０からのデータ読み込みとレベル１の水平処理の期間であり、時刻ｔ１〜時刻ｔ４までが内部でのレベル１の垂直処理、レベル２の水平処理と垂直処理の期間である。時刻ｔ４から時刻ｔ５までがフレームメモリ２１０へのデータ書き出しの期間である。フレームメモリ２１０に対するデータの読み出しと書き込みはそれぞれ４００サイクルと２５６サイクルである。前述のように、内部処理の時刻ｔ１〜時刻ｔ４までの期間は、内部動作がフレームメモリ２１０へのアクセスに比べ高速であれば殆ど無視できるが、ここでは１００サイクルと仮定しサイクル数を計算すると、時刻ｔ０から時刻ｔ５までの総サイクル数は７５６サイクルとなる。この処理が４つのブロックＢ００，Ｂ０１，Ｂ１０，Ｂ１１（図５参照）に対し繰り返されてレベル２の変換データが得られる。レベル４の変換データを得る場合には、Ｂ００〜Ｂ１１の４ブロックに含まれるＳＳ係数（２ＳＳ）を集め１つのブロックとして処理される。この処理では９画素、９ライン（０から数えて）となるので、サイクル数はブロックＢ００の処理の４分の１、すなわち１８９サイクルとなる。したがって、レベル４までのウェーブレット変換処理にかかる総時間は
（４００＋２５６＋１００）＊（４＋１／４）＝３２１３
サイクルとなる。また、内部処理を５０サイクルと仮定すれば、総時間は
（４００＋２５６＋５０）＊（４＋１／４）＝３０００
サイクルまで減少する。
【００５２】
このように、本実施例によれば、ウェーブレット変換処理の総時間を従来技術での５４４０に比べ大幅に短縮できることは明かであり、さらに、前記第１実施例と比較しても更なる高速処理が可能であることが理解されよう。
【００５３】
符号化復号化部２１１による符号化処理と復号化処理は、前記第１実施例と同様であるので説明を省略するが、記憶部２０２がウェーブレット変換処理のためのバッファ領域として、また符号化／復号化処理のためのバッファ領域としても共用されるため、それぞれのための記憶部を別々に用意する構成にくらべ、記憶部の占めるチップ面積を増大させることなく、高速処理を実現できる。
【００５４】
前記第１実施例又は第２実施例においては、前述のようにラインメモリ（１０４，２０４）に、外部から入力された元データがそのまま書き込まれたが、本発明の第３の実施例によれば、同様の構成の符号化復号化装置において、ウェーブレット変換のフィルタの特性を活用し、ラインメモリに対し処理の中間データ（本発明では低域通過型フィルタの出力）が書き込まれる。これにより、他の条件が前記各実施例と同じならば、図１２に示すように、ラインメモリ（１０４，２０４）の行方向のサイズＸは前記各実施例の場合と同様にフレームメモリ（１１０，２１０）の行方向サイズと同じであるが、列方向のサイズＹは前記各実施例の場合の２から１へと半減させることができる。このようなラインメモリの容量削減の効果は、フレームメモリの行方向サイズが大きいほど、また、高域通過型フィルタのタップ数が大きいほど顕著である。
【００５５】
本発明の第４の実施例によれば、前記第１実施例、第２実施例又は第３実施例と同様の構成において、図１３に示すように、記憶部（１０２，２０２）のサイズを２行、２列だけ小さくすることができる。これは、本実施例では、記憶部のオーバーラップ領域（図１３の斜線領域）に対し、フィルタの特性を活用し、処理の中間データ（本発明では低域通過型フィルタの出力）が書き込まれるからである（前記各実施例では、オーバーラップ領域に外部から入力された元データがそのまま書き込まれる）。本実施例による記憶部の容量削減効果は、高域通過型フィルタのタップ数が大きいほど顕著である。
【００５６】
前記各実施例においては、一度に処理できる画像のサイズが記憶部（１０２，２０２）のサイズによって制限され、したがって、要求されるウェーブレットレベルの数が増えた場合は、あるレベルまでの結果をフレームメモリ（１１０，２１０）へ一旦書き出し、そのＳＳ係数を再び読み込んで処理するという再帰的な処理方法で対処する必要がある。
【００５７】
本発明の第５の実施例によれば、前記各実施例と同様の構成において、予め要求されるウェーブレット変換の最大レベルが分かっている場合に、その最大レベルまでの全レベルのウェーブレット変換の処理を内部で連続して実行するために必要な記憶容量を、記憶部（１０２，２０２）に持たせる。例えば、レベル６までのウェーブレット変換が要求される場合には、記憶部（１０２，２０２）の大きさは（６４行×６４列＋オーバーラップ領域）に決定される。このような大きさに設定するならば、上に述べたような再帰的な処理を行うことなく、読み込んだデータに対し内部処理でレベル１からレベル６までの全レベルのウェーブレット変換を行い、その結果をフレームメモリに書き出すことができるため、内部処理に比べ速度の遅いフレームメモリに対する読み出し及び書き込みの回数が少なくなる分、処理の一層の高速化が可能となる。
【００５８】
【発明の効果】
以上の説明から明らかな如く、本発明の符号化復号化装置は、従来技術のようなウェーブレット変換又は逆ウェーブレット変換にパイプライン処理を必要としない構成であり、ウェーブレット変換の処理画素数やレベル数の変更に容易に対応でき、また、ウェーブレット変換又は逆ウェーブレット変換のためのバッファ領域と符号化又は復号化のためのバッファ領域として同じ記憶部を共用する構成であるため、記憶部のためのチップ面積を増大させることなく高速の処理が可能である。ここで、請求項１の符号化復号化装置は、記憶部が任意にアクセス可能な独立した記憶セルからなるため、さらなる高速処理が可能である。請求項２の符号化復号化装置は、外部からのデータ入力とウェーブレット変換の水平処理との並行化により、一層の高速化が可能である。請求項３又は４の符号化復号化装置は、ラインメモリ及び／又は記憶部のためのチップ面積をさらに減らすことができる。請求項５の符号化復号化装置は、要求される最大レベルまでの全レベルのウェーブレット変換を内部で連続して実行できるため、より一層の高速処理が可能である。
【図面の簡単な説明】
【図１】本発明の第１実施例を示すブロック図である。
【図２】第１実施例による記憶部の記憶セルの構成を示すブロック図である。
【図３】記憶部の行方向のデータフローの説明図である。
【図４】レベル１の水平処理終了段階での記憶部のマッピングを示す図である。
【図５】フレームメモリに対する記憶部とラインメモリの割り当て方を例示する図である。
【図６】各ブロック処理時の記憶部とラインメモリに関するデータフローを説明する図である。
【図７】レベル２までのウェーブレット変換を終了した時点での記憶部のマッピングを示す図である。
【図８】第１実施例において１ブロックに対しレベル２までのウェーブレット変換を行う場合のタイミングチャートである。
【図９】本発明の第２の実施例を示すブロック図である。
【図１０】第２実施例における記憶部の１つの記憶セルを示す図である。
【図１１】第２実施例において１ブロックに対しレベル２までのウェーブレット変換を行う場合のタイミングチャートである。
【図１２】本発明の第３実施例を説明するための図である。
【図１３】本発明の第４実施例を説明するための図である。
【図１４】従来例を示すブロック図である。
【図１５】従来例における符号化復号化部のブロック図である。
【図１６】ウェーブレット変換の水平処理及び垂直処理の演算方法の説明図である。
【図１７】イメージデータのメモリマップを示す図である。
【図１８】レベル１のＳ係数及びＤ係数のメモリマップを示す図である。
【図１９】レベル１のＳＳ係数、ＳＤ係数、ＤＳ係数及びＤＤ係数のメモリマップを示す図である。
【図２０】レベル２のＳ係数及びＤ係数のメモリマップを示す図である。
【図２１】レベル２のＳＳ係数、ＳＤ係数、ＤＳ係数及びＤＤ係数のメモリマップを示す図である。
【図２２】従来例のタイミングチャートである。
【符号の説明】
１００主制御部
１０１ウェーブレット変換部
１０２記憶部
１０４ラインメモリ
１０６行方向制御部
１０８列方向制御部
１０９符号化復号化部
１１０外部のフレームメモリ
２００主制御部
２０２記憶部
２０４ラインメモリ
２０６行方向フィルタ部
２０７列方向フィルタ部
２０８行方向アドレスデコード／データ選択部
２０９列方向アドレスデコード／データ選択部
２１０外部のフレームメモリ
２１１符号化復号化部

[0001]
[Technical field of the invention]
The present invention relates to an encoding / decoding apparatus for compressing / decompressing image data and the like, and more particularly to an encoding / decoding apparatus using wavelet transform.
[0002]
[Prior art]
Data compression is a very useful tool for storing and transmitting large amounts of data. For example, the time required to transmit an image, as in facsimile transmission of a document or over the World Wide Web, is drastically shortened when compression is used to reduce the number of bits needed to reproduce the image.
[0003]
Many different data compression techniques have long existed. The most widely used compression method is JPEG (Joint Photographic Experts Group). In JPEG, input symbols or luminance data are quantized and then converted into output codewords. Quantization aims to preserve the important information in the data while removing unimportant information. A transform is used to concentrate energy prior to quantization, and JPEG uses the DCT (Discrete Cosine Transform) for this purpose. However, various drawbacks of DCT-based JPEG have been pointed out, for example block noise and mosquito noise (so called because it looks as if mosquitoes were flying around). In image signal processing, attention has therefore focused on the pursuit of efficient, high-precision data compression coding methods that can eliminate these drawbacks. One such method is wavelet pyramid processing.
[0004]
When the wavelet transform is applied to a two-dimensional signal such as image data, the input signal is first separated into a horizontal low-frequency signal (S (smooth) coefficients) and a horizontal high-frequency signal (D (detail) coefficients) using a horizontal low-pass filter HL (Horizontal Low) and a horizontal high-pass filter HH (Horizontal High). A vertical low-pass filter VL (Vertical Low) and a vertical high-pass filter VH (Vertical High) are then applied to the S coefficients and to the D coefficients, separating them into a horizontal-low/vertical-low signal (SS coefficients), a horizontal-low/vertical-high signal (SD coefficients), a horizontal-high/vertical-low signal (DS coefficients), and a horizontal-high/vertical-high signal (DD coefficients). This series of processes is called a level, and the output obtained after one horizontal pass and one vertical pass is called the level 1 output. The four kinds of signals above are called frequency band signals. When output of level 2 or higher is desired, this processing is applied recursively to the SS coefficients. The level 2 output consists of seven frequency band signals: the SS coefficients, the 1SD and 2SD coefficients, the 1DS and 2DS coefficients, and the 1DD and 2DD coefficients. In the above description the filters are applied first in the horizontal direction and then in the vertical direction, but the order may be reversed.
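The level structure described above can be sketched in Python as follows. This is only an illustration: the pairwise average and pairwise difference filters and the function names `split_rows`, `one_level`, and `decompose` are assumptions chosen for brevity, not the filters or structure of any particular device.

```python
def split_rows(rows):
    """Split every row into a low (S) half and a high (D) half.

    Assumed filters: pairwise average (low) and pairwise difference (high).
    """
    low, high = [], []
    for row in rows:
        low.append([(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)])
        high.append([row[i] - row[i + 1] for i in range(0, len(row), 2)])
    return low, high


def transpose(rows):
    """Swap rows and columns so a vertical pass can reuse split_rows."""
    return [list(col) for col in zip(*rows)]


def one_level(image):
    """One level: a horizontal pass, then a vertical pass on both halves."""
    s, d = split_rows(image)                # horizontal: S and D coefficients
    ss, sd = split_rows(transpose(s))       # vertical pass on S -> SS, SD
    ds, dd = split_rows(transpose(d))       # vertical pass on D -> DS, DD
    return {"SS": transpose(ss), "SD": transpose(sd),
            "DS": transpose(ds), "DD": transpose(dd)}


def decompose(image, levels):
    """Recurse on SS; return the final SS plus the SD, DS, DD bands per level."""
    bands = {}
    current = image
    for level in range(1, levels + 1):
        out = one_level(current)
        for name in ("SD", "DS", "DD"):
            bands[f"{level}{name}"] = out[name]
        current = out["SS"]                 # only SS is processed further
    bands["SS"] = current
    return bands
```

With `levels=2` the returned dictionary holds seven entries (SS, 1SD, 2SD, 1DS, 2DS, 1DD, 2DD), matching the count given in the text. Image width and height are assumed to be divisible by 2 at every level.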
[0005]
Each frequency band signal obtained through the above process is encoded by the encoding unit. Encoding is performed bit by bit for each frequency band signal. First, the most significant bit (MSB) of the first pixel of a given frequency band signal is processed. The output is determined by referring to the state of this pixel itself, the states of the surrounding pixels, and the state one level higher. Next, the MSB of the second pixel is processed, and this time the state of the pixel processed first is also referred to. Once this series of processing has been completed for the region to be encoded, the bit next below the MSB (MSB-1) of the first pixel becomes the processing target. At this point the state of the MSB is referred to in addition to the states of the surrounding pixels at the same bit depth. In this way, encoding proceeds down to the least significant bit (LSB) for the state to be encoded. Encoded data is decoded through almost the same procedure.
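The processing order described above (every pixel of the region at the MSB, then every pixel at MSB-1, and so on down to the LSB) can be sketched as a small Python generator. The function name `bitplane_scan` is hypothetical, coefficients are assumed to be non-negative integers, and the context modelling that decides the actual output for each bit is deliberately left out.

```python
def bitplane_scan(coeffs, depth):
    """Yield (bit_position, pixel_index, bit) from the MSB plane to the LSB plane.

    coeffs: non-negative integers in scan order; depth: bits per coefficient.
    """
    for bit in range(depth - 1, -1, -1):        # MSB first, LSB last
        for idx, value in enumerate(coeffs):    # whole region before the next plane
            yield bit, idx, (value >> bit) & 1
```

A real coder would consult already-visited neighbours and higher planes at each step; here only the visiting order is shown.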
[0006]
FIG. 14 shows a conventional configuration when processing up to level 4 is performed. In the figure, 1000 is a wavelet transform unit, 1001 is an interface, 1002 is a frame memory, and 1003 is an encoding / decoding unit.
[0007]
In the wavelet transform unit 1000, filter1H, filter2H, filter3H, and filter4H are horizontal filters, each including a horizontal low-pass filter HL and a horizontal high-pass filter HH. The numbers 1 to 4 in these filter names represent the level, and H means a horizontal filter. Similarly, filter1V1 and filter1V2, filter2V1 and filter2V2, filter3V1 and filter3V2, and filter4V1 and filter4V2 are vertical filters, each including a vertical low-pass filter VL and a vertical high-pass filter VH. The V in these filter names means a vertical filter, and the number 1 to 4 before the V represents the level. The number 1 after the V indicates a filter whose input is the horizontal low-frequency signal (S coefficients), and the number 2 after the V indicates a filter whose input is the horizontal high-frequency signal (D coefficients). These filters may have any configuration, but the following description assumes that the horizontal low-pass filter HL and the vertical low-pass filter VL are 2-tap filters that operate on two sets of data. It likewise assumes that the horizontal high-pass filter HH and the vertical high-pass filter VH are 6-tap filters that operate on a total of three sets of data taken from the S or D coefficients output by the low-pass filter HL or VL: the current position, the one before, and the one after.
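To make the tap counts concrete, here is a minimal Python sketch of a 2-tap low-pass and a 6-tap high-pass filter on one line of samples. The weights `(0.5, 0.5)` and `(-1, -1, 8, -8, 1, 1) / 8` and the names `low_pass` and `high_pass` are assumptions for illustration only; the text states that the filters may have any configuration and gives no coefficients. Edge handling is omitted, so `high_pass` is valid only where all six samples exist.

```python
LOW_TAPS = (0.5, 0.5)                  # assumed 2-tap low-pass weights
HIGH_TAPS = (-1, -1, 8, -8, 1, 1)      # assumed 6-tap high-pass weights (divided by 8)


def low_pass(x, n):
    """S coefficient n, computed from the pair x[2n], x[2n+1]."""
    return LOW_TAPS[0] * x[2 * n] + LOW_TAPS[1] * x[2 * n + 1]


def high_pass(x, n):
    """D coefficient n, computed from the previous, current and next pair.

    Interior positions only: requires n >= 1 and 2n + 4 <= len(x).
    """
    window = x[2 * n - 2: 2 * n + 4]   # six samples = three pairs
    return sum(w * v for w, v in zip(HIGH_TAPS, window)) / 8
```

With these assumed weights a constant or linearly increasing line gives a zero D coefficient, which is the behaviour one expects from a high-pass filter.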
[0008]
As shown in FIG. 15, the encoding / decoding unit 1003 includes a processing unit 1010, a storage unit 1011, and a control unit 1012. The storage unit 1011 consists of a storage element B 1014 that holds the target frequency band signal, a storage element A 1013 that holds the frequency band signal one level above the target frequency band signal, and an address generation unit 1015 for these storage elements. The storage elements A 1013 and B 1014 are n bits deep. If high-speed processing is required, they are configured so that reading or writing is possible in units of bits; if high-speed processing is not required, they are configured so that a word is read, the target bit is written, and the word is written back.
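The slower, word-wise alternative mentioned above amounts to a read-modify-write sequence. The following Python sketch models the storage element as a plain list of integers; the function name `write_bit_wordwise` and the list-based memory model are assumptions for illustration.

```python
def write_bit_wordwise(memory, address, bit, value):
    """Update one bit of a stored word using whole-word accesses only."""
    word = memory[address]                                  # read the whole word
    word = (word & ~(1 << bit)) | ((value & 1) << bit)      # replace the target bit
    memory[address] = word                                  # write the whole word back
```

Each single-bit update costs one word read and one word write, which is why a bit-addressable configuration is preferred when speed matters.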
[0009]
FIG. 16 shows an example of processing in the wavelet transform unit 1000 when the above-described filters are used. Note, however, that the data mapping in this figure is meant to explain the calculation method; the actual mapping in memory is, for example, as shown in FIGS. 18 to 21.
[0010]
FIG. 16(a) explains the horizontal filter processing. [00] denotes the data of pixel 0 on line 0, and [12] denotes the data of pixel 2 on line 1 (both lines and pixels are counted from 0). The output [S00] for pixel 0 of the horizontal low-pass filter HL is obtained from data [00] and data [01], and the output [S01] for pixel 1 is obtained from data [02] and data [03]. In contrast, the output [D00] for pixel 0 of the horizontal high-pass filter HH is obtained from the data two before and one before data [00] (which do not exist), data [00], data [01], data [02], and data [03]. To obtain the nonexistent data two before and one before data [00], a process called mirroring is applied; specifically, the data is folded back in a mirror-image relationship. As a result, the data two before and one before become data [01] and data [00]. In this way, [D00] is calculated from six pixels of data.
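The mirror processing at the start of a line can be expressed as an index mapping. In the Python sketch below, the names `mirror_index` and `high_pass_window` are hypothetical, and the folding rule used at the right-hand end of the line is an assumption made by symmetry; the text describes only the left-hand case.

```python
def mirror_index(i, length):
    """Fold an out-of-range index back into [0, length) as a mirror image."""
    if i < 0:
        return -i - 1                  # -1 -> 0, -2 -> 1
    if i >= length:
        return 2 * length - 1 - i      # assumed symmetric rule at the far edge
    return i


def high_pass_window(n, length):
    """Indices of the six samples that feed D coefficient n on one line."""
    return [mirror_index(i, length) for i in range(2 * n - 2, 2 * n + 4)]
```

For `n = 0` the window is `[1, 0, 0, 1, 2, 3]`: the two nonexistent samples are replaced by data [01] and data [00], as in the description of [D00].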
[0011]
FIG. 16(b) explains the vertical filter processing. This processing is performed in the vertical direction on the S and D coefficients produced by the horizontal filter processing. Nonexistent coefficients are supplied by mirror processing, as in the horizontal filter processing.
[0012]
FIG. 17 shows image data stored in the frame memory 1002 in raster order. 18 to 21 illustrate a method for storing the calculation results up to level 2 of the wavelet processing. First, the wavelet transform unit 1000 reads the stored image data from the frame memory 1002 as shown in FIG. 17, performs horizontal processing, and writes the result in the frame memory 1002 again. In this writing, the S coefficient and the D coefficient are written by mapping as shown in FIG. 18 so that unprocessed data is not overwritten. In FIG. 18, [1S00] means the S coefficient of the address 00 of level 1. FIG. 19 shows an example of mapping when writing each coefficient after performing vertical processing. This is the method of storing each level 1 coefficient. FIG. 20 shows an example of a method for storing level 2 horizontal coefficients. Note that the shaded portion of data is not used because level 2 processing is only performed on 1SS coefficients. Next, each level 2 coefficient is stored by mapping as shown in FIG. 21, and the level 2 processing is completed. The above processing is repeated up to level 4.
[0013]
FIG. 22 is a timing chart for the configuration shown in FIG. 14. Note, however, that this timing chart is used to explain the processing procedure and that the scale of the horizontal axis (time axis) is not linear. In the following description, pixels and lines are counted from 0, as in pixel 0 or line 0. The input image data (raster data) is 32 pixels × 32 lines (pixel 0 to pixel 31, line 0 to line 31), and one data division (× = ×) is taken to correspond to one line.
[0014]
From time t0, the data of line 0 is input sequentially starting from pixel 0, and when pixel 1 is input, filter1H outputs the data [1S00] for pixel 0. When the data [1S01] is then output, the three sets of S coefficients ([1S00], [1S00], [1S01]) needed to calculate the D coefficient are available (the preceding one is obtained by mirror processing), and the D coefficient [1D00] is output. This is repeated for one line. Note that the timing chart is drawn in units of one line time, but on an enlarged scale there are offsets in units of pixels.
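The pixel-level offsets mentioned above can be illustrated with a small Python sketch that lists, for one line, which coefficients become available as each pixel arrives. The function name `emission_schedule` and the labels `S0`, `D0`, ... are hypothetical, and the handling of the last D coefficient (emitted immediately because its "next" S value is mirrored) is an assumption.

```python
def emission_schedule(width):
    """List (pixel_index, outputs) for one line fed in pixel by pixel."""
    n_s = width // 2
    events = []
    for p in range(1, width, 2):            # an S coefficient completes on every odd pixel
        k = p // 2
        outputs = [f"S{k}"]
        if k >= 1:
            outputs.append(f"D{k - 1}")     # D(k-1) was waiting for S(k) as its next value
        if k == n_s - 1:
            outputs.append(f"D{k}")         # assumed: right edge mirrored, so no wait
        events.append((p, outputs))
    return events
```

For the first two events this reproduces the sequence in the text: the first S coefficient appears when pixel 1 arrives, and the first D coefficient appears together with the second S coefficient.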
[0015]
Input of the data of the first line starts at time t1, and the S and D coefficients [1S10], [1D10], and so on are sequentially output from filter 1H. When [1S10] is output, [1SS00] is output from the vertical filter filter1V1 and [1DS00] is output from filter1V2. Next, when [1S11] is output, the three data items necessary for calculating the D coefficient are available in filter1V1 and filter1V2: [1S10], [1S10], [1S11] in filter1V1, and [1D10], [1D10], [1D11] in filter1V2 (the preceding data is obtained by mirror processing). The level 1 output data [1SS00], [1SD00], [1DS00], and [1DD00] are thus obtained. This is repeated for one line.
[0016]
At time t2, input of the first line of 2V is started and 2V processing is started. Thereafter, the processing is repeated until time t9 with the same timing relationship, and each frequency band signal up to level 4 is output. Each of these data is written into the frame memory 1002.
[0017]
Each frequency band signal written in the frame memory 1002 is encoded by the encoding / decoding unit 1003. The encoding / decoding unit 1003 raises the compression rate by exploiting the high correlation between adjacent pixels of the image signal, in particular the correlation within the same bit plane. For this reason, encoding must handle the data of a certain area in units of bits (any one bit of the data of a given pixel). Since image data is usually very large (in some cases several MB), the data after wavelet processing must first be written to the frame memory 1002. However, the frame memory 1002 cannot be read or written in bit units. Therefore, the encoding / decoding unit 1003 loads the data from the frame memory 1002 into its internal storage elements 1013 and 1014, which can be processed in units of bits, performs the encoding process, and outputs the encoded data code. Decoding is performed in approximately the reverse order of the operations described above.
[0018]
Next, processing time for wavelet transform will be described. Here, it is assumed that a general semiconductor memory is used as the storage of each frequency band signal generated by the wavelet transform unit 1000, that is, the frame memory 1002 in FIG.
[0019]
As described with reference to the timing chart of FIG. 22, the frequency band signals are output in parallel at the same time, so writing to the external memory would also have to be performed in parallel; however, only one data item can be read or written at a time. The ranges marked ←→ at the lower left of FIG. 22, labeled 1H, 1V, ..., 4V, represent the processing time occupied by each stage. The r / w cycles shown below each range are the total number of writes and reads within that range; where different levels are processed simultaneously, the counts for those levels are summed. The numerical values on the right side of FIG. 22 are the number of memory accesses (total number of writes and reads) required for horizontal processing or vertical processing at each level. These values are the same for inverse wavelet transform. At each level, both horizontal processing and vertical processing always read all the data once and rewrite all of it with the filter output data, so each pass requires one read access and one write access per data item.
[0020]
For more detailed information on the encoding / decoding apparatus and the wavelet transform apparatus or wavelet transform filter related to the present invention as described above, refer to Japanese Patent Laid-Open No. 8-139935. For the encoding unit, refer to JP-A-9-121168. Further, for conventional techniques relating to similar wavelet transforms, refer to Japanese Patent Laid-Open Nos. 3-27687, 5-167997, or 5-183386.
[0021]
[Problems to be solved by the invention]
As described above, the output of the wavelet transform needs to be temporarily stored in the storage, and there is a problem that the processing time is several times longer than the time required simply to input the data. In the case of the above-described prior art, when the size of the input image data is 32 pixels × 32 lines and the number of levels is 4, the number of cycles required to input the image data is 1024 (= 32 × 32), whereas the required processing time is 5440 cycles, which is more than five times as long. Clearly, the processing time increases significantly as the size of the input data increases. For example, in the case of 64 pixels × 64 lines, as shown by the dotted line in FIG. 22, the 1H processing continues until time t10, so the number of sections output in parallel increases and the processing time increases significantly. Similarly, the processing time increases significantly when the number of levels increases.
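The 5440-cycle figure can be cross-checked with a short sketch. It assumes, as the prior-art description states, that every horizontal or vertical pass reads all of its data once and writes it all back, and that level k operates only on the SS band left by level k-1. The function name `prior_art_accesses` is hypothetical and used only for illustration.

```python
def prior_art_accesses(width, height, levels):
    """Total frame-memory accesses when every pass reads and rewrites all its data."""
    total = 0
    for level in range(1, levels + 1):
        # Level k works only on the SS band of the previous level,
        # so each dimension is halved per level (assumption stated above).
        n = (width >> (level - 1)) * (height >> (level - 1))
        passes = 2        # one horizontal pass and one vertical pass
        accesses = 2 * n  # one read and one write per data item
        total += passes * accesses
    return total


input_cycles = 32 * 32
total = prior_art_accesses(32, 32, 4)
print(input_cycles, total, total / input_cycles)
```

Under these assumptions the count for 32 pixels × 32 lines and 4 levels is 4 × (1024 + 256 + 64 + 16), which agrees with the processing time quoted above.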
[0022]
Moreover, since the frequency band signals of each level are output at the same time, pipeline processing is necessary. That is, since the timing at which data is input differs for each filter, a controller designed individually for the location where each filter is used must be incorporated. In addition, these controllers can handle only one combination of pixel count and level count, and there is a problem that it is difficult to adapt when either the number of pixels or the number of levels, or both, is changed.
[0023]
In addition, since a storage unit for processing is provided separately in the wavelet transform unit and in the encoding / decoding unit, there is a problem that the chip area occupied by the storage units becomes large when the encoding / decoding device is made into one chip.
[0024]
The present invention has been made in view of the above-described problems, and a main object thereof is to provide an encoding / decoding device capable of high-speed processing without requiring pipeline processing. Another object of the present invention is to provide an encoding / decoding device capable of high-speed processing which requires less chip area occupied by a storage unit.
[0026]
[Means for Solving the Problems]
The encoding / decoding device of claim 1 includes: an encoding / decoding unit for encoding a frequency band signal obtained by wavelet transform or decoding the encoded data; a storage unit composed of independent storage elements; a line memory for temporarily storing a part of the write data of the storage unit; a row direction address decoding / data selection unit for controlling access to the storage unit in the row direction; a column direction address decoding / data selection unit for controlling access to the storage unit in the column direction; a row direction filter unit for performing horizontal processing of wavelet transform or inverse wavelet transform on data input via the row direction address decoding / data selection unit; a column direction filter unit for performing vertical processing of wavelet transform or inverse wavelet transform on data input via the column direction address decoding / data selection unit; and a main control unit that controls writing and reading of data to and from the storage unit via the row direction address decoding / data selection unit and the column direction address decoding / data selection unit, input and output of data to and from the row direction filter unit and the column direction filter unit, input and output of data to and from the encoding / decoding unit, writing and reading of data to and from the line memory, and writing and reading of data to and from the external memory. The storage unit is shared as a buffer area for the wavelet transform or inverse wavelet transform processing and as a buffer area for the encoding or decoding processing.
[0027]
The encoding / decoding device of claim 2 is characterized in that, in the configuration of the encoding / decoding device of claim 1, horizontal processing of wavelet transform is executed in parallel with data input from the external memory.
[0028]
The encoding / decoding device of claim 3 is characterized in that, in the configuration of the encoding / decoding device of claim 1 or 2, the data written in the line memory is intermediate data for processing.
[0029]
The encoding / decoding device of claim 4 is characterized in that, in the configuration of the encoding / decoding device of claim 1 or 2, intermediate data for processing is written in the overlap area of the storage unit.
[0030]
The encoding / decoding device of claim 5 is characterized in that, in the configuration of the encoding / decoding device of any one of claims 1 to 4, the storage unit has the storage capacity necessary for continuously executing, internally, the wavelet transform processing of all levels up to the required maximum level.
[0031]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 1 is a block diagram showing a first embodiment of the present invention. The encoding / decoding apparatus of the present embodiment includes a main control unit 100, a wavelet transform unit 101 for wavelet transform or inverse wavelet transform, an encoding / decoding unit 109 for encoding a frequency band signal obtained by wavelet transform or decoding the encoded data, a storage unit 102, a line memory 104, a row direction control unit 106, and a column direction control unit 108. Reference numeral 110 denotes an external frame memory connected to the present encoding / decoding device. The row direction control unit 106 controls access to the storage unit 102 in the row direction (x direction), and the column direction control unit 108 controls access to the storage unit 102 in the column direction (y direction). The main control unit 100 controls writing and reading of data to and from the storage unit 102 via the row direction control unit 106 and the column direction control unit 108, writing and reading of data to and from the line memory 104, writing and reading of data to and from the frame memory 110, and the input and output of data to and from the wavelet transform unit 101 and the encoding / decoding unit 109.
[0032]
The storage unit 102 is shared as a buffer area for the processing of the wavelet transform unit 101 and of the encoding / decoding unit 109. Therefore, the wavelet transform unit 101 and the encoding / decoding unit 109 do not need internal storage units as in the prior art. In this embodiment, the storage unit 102 is composed of shift registers that can shift data in the row direction (x direction) and the column direction (y direction). Here, the size of the storage unit 102 is assumed to be 20 rows × 20 columns (20 pixels × 20 lines). FIG. 2 shows the configuration of a storage cell in the storage unit 102. Each storage cell has a bit depth of n bits.
[0033]
Hereinafter, the present embodiment will be described in detail with reference to FIGS. 1 to 8. FIG. 3 shows an example of how data is written to the storage unit 102, and FIG. 4 shows an example of how data is written in the storage unit 102 after horizontal processing. FIG. 5 shows an example of how the storage unit 102 and the line memory 104 are allocated (tiled) over the frame memory 110. FIG. 6 is a diagram for explaining data movement and copying related to the storage unit 102 and the line memory 104. FIG. 7 is a diagram illustrating an example of the mapping in the storage unit 102 at the time level 2 is completed. FIG. 8 is a timing chart of the conversion processing operation up to level 2 for one block. Note that the content of each drawing is an example, and various forms are possible without departing from the gist of the present invention.
[0034]
In FIG. 2, each part surrounded by a dotted line represents one storage cell of the storage unit 102. As illustrated, each storage cell of the storage unit 102 includes a shift register SR having a bit depth of n bits and a multiplexer MUX. The data input of the shift register SR of each storage cell receives, via the multiplexer MUX, either the output data of the storage cell in the previous stage (right side) in the row direction or the output data of the storage cell in the previous stage (lower side) in the column direction. This switching of input data, that is, switching between data shift in the row direction and data shift in the column direction, is controlled by a control input hvb (horizontal / vertical bar) to the multiplexer MUX. As is apparent from the above description, data shift in the row direction (x direction) is performed from right to left, and data shift in the column direction (y direction) is performed from bottom to top.
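The shift behavior described above can be modeled in software as a minimal sketch. The class `ShiftArray` and its method names are hypothetical, and the sketch assumes that a single row or a single column is shifted per call; the actual cell wiring and the hvb timing are given only in FIG. 2.

```python
class ShiftArray:
    """Toy model of a 2-D array of storage cells; row 0 is the top row."""

    def __init__(self, rows, cols):
        self.cells = [[0] * cols for _ in range(rows)]

    def shift_row(self, r, data_in):
        """Row-direction shift: data enters at the right end and moves left."""
        row = self.cells[r]
        shifted_out = row[0]        # the leftmost cell leaves the array
        row[:-1] = row[1:]          # every cell takes its right neighbor's value
        row[-1] = data_in
        return shifted_out

    def shift_col(self, c, data_in):
        """Column-direction shift: data enters at the bottom and moves up."""
        shifted_out = self.cells[0][c]  # the top cell leaves the array
        for r in range(len(self.cells) - 1):
            self.cells[r][c] = self.cells[r + 1][c]
        self.cells[-1][c] = data_in
        return shifted_out


arr = ShiftArray(3, 4)
for value in [1, 2, 3, 4]:
    arr.shift_row(2, value)         # fill the bottom row from the right end
print(arr.cells[2])
```

Feeding the shifted-out value back into the same row or column after filtering mirrors the way the embodiment rewrites coefficients in place.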
[0035]
When performing wavelet transform, the data of the 0th line of the image data stored in the frame memory 110 is first read out under the control of the main control unit 100, input via the row direction control unit 106 to the foremost stage (right end) of the lowest row of the storage unit 102 (the j-th row; see FIG. 4), and sequentially shifted leftward. However, since mirror processing is required for the 0th pixel and the 1st pixel, they are written in the storage unit 102 in that order from the beginning. The shaded portion in FIG. 3 shows pixel data subjected to mirror processing. When all the data of the 0th line has been written as shown in FIG. 3B, this data is shifted up one row. Next, the data of the first line is similarly read from the frame memory 110, input to the bottom row of the storage unit 102 by the row direction control unit 106, and sequentially shifted to the left. In parallel with this data input, horizontal processing is executed by the wavelet transform unit 101, and the S coefficients and D coefficients calculated for the 0th line data are input via the row direction control unit 106 to the row in which the 0th line data is written and sequentially shifted to the left. FIG. 3C shows the state during the horizontal processing of the 0th line and the data input of the 1st line. Similar image data input and horizontal processing are repeatedly executed in parallel, so that the result of the horizontal processing is finally mapped in the storage unit 102 as shown in FIG. Data is also written to and read from the line memory 104, as described later.
[0036]
In FIG. 4, note that the same data as in the third and second rows is written in the 0th and 1st rows of the storage unit 102. This is mirror processing in the vertical direction. Specifically, at the stage where the processing up to the j-th row is completed, the data of the third row is written into the 0th row and the data of the second row is written into the first row under the control of the row direction control unit 106. Note also that the same data as in the g-th column and the h-th column is written in the i-th column and the j-th column of each row. This is performed under the control of the row direction control unit 106 in the course of horizontal processing, and the reason will be described later.
[0037]
Now, vertical processing is performed on the S coefficient and D coefficient on the storage unit 102 mapped as shown in FIG. However, it should be noted that the 0th column and the 1st column are not subject to processing. In the case of vertical processing, the data in the second column is sequentially shifted in the column direction (upward) under the control of the column direction control unit 108, and the shifted-out data is input to the wavelet transform unit 101 via the main control unit 100. Then, SS coefficients and SD coefficients are calculated, and these coefficients are sequentially input to the second column of the storage unit 102 and shifted under the control of the column direction control unit 108. Similarly, the data in the third column is shifted in the column direction, and the shifted-out data is input to the wavelet transform unit 101 to calculate DS coefficients and DD coefficients, which are input to the third column and shifted. By repeating the same processing up to the h-th column, the level 1 wavelet transform is completed.
[0038]
The level 2 wavelet transform is performed only on the level 1 SS coefficients (1SS) in the storage unit 102. That is, the data of each row is shifted in the row direction under the control of the row direction control unit 106, and the shifted-out 1SS coefficients are sent to the wavelet transform unit 101 to calculate the 2S coefficients and 2D coefficients. Together with the level 1 data (1SD, 1DS, and 1DD coefficients), these are input to the corresponding row and sequentially shifted in the order shown in FIG. By repeating this, level 2 horizontal processing ends. The 0th and 1st rows are not subject to processing, nor is the data in the i-th column and the j-th column of each row. Next, level 2 vertical processing is performed. For each column from the second column to the g-th column, the data is sequentially shifted in the column direction under the control of the column direction control unit 108, and the shifted-out 2S or 2D coefficients are input to the wavelet transform unit 101 to calculate the 2SS and 2SD coefficients, or the 2DS and 2DD coefficients. These coefficients are input to the column together with the other shifted-out data and sequentially shifted in the order of the mapping shown in FIG. 21. This process is repeated to finish the level 2 processing. Thus, the wavelet transform results up to level 2 are mapped in the storage unit 102 as shown in FIG. The 16 pixel × 16 line data is then written to the frame memory 110. When performing conversion of level 3 or higher, only the SS coefficients on the frame memory 110 are read into the storage unit 102 and the same processing is performed.
[0039]
FIG. 5 shows a tiling method for the case where the size of the frame memory 110 is larger than the size of the storage unit 102. B00 means the block at (horizontal 0, vertical 0). Here, since the size of the storage unit 102 is 20 pixels × 20 lines, x0 in FIG. 5 is the 15th pixel, 2x0 is the 31st pixel, y0 is the 15th line, and 2y0 is the 31st line (x0 = y0 = 15, counting from 0). Therefore, an image of 32 pixels × 32 lines (0th to 31st pixel, 0th to 31st line) is divided into four blocks B00, B01, B10, and B11 as shown in the figure (this size is chosen for comparison with the prior art). Next, the processing of each block will be described with reference to FIG.
<Process of block B00: FIG. 6 (a)>
For the block B00, image data of 18 pixels × 18 lines from the 0th pixel to the 17th pixel and from the 0th line to the 17th line is read into the storage unit 102. Since the data (1) shown in FIG. 6 (a) does not actually exist, it is supplemented by mirror processing. The data of the portion (2) (corresponding to the g-th and h-th rows in FIG. 4), which is necessary for processing the lower adjacent block B10 but is overwritten by the conversion data of the block B00, is copied to the line memory 104 in advance. Further, the data of the portion (3) (corresponding to the g-th column and the h-th column in FIG. 4), which is overwritten by the conversion data of the block B00, is copied in advance to the right end of the storage unit 102 (the i-th column and the j-th column in FIG. 4) so that it can be used in the processing of the right adjacent block B01. Conversion is performed on the block B00, and the converted data (the 16 pixel × 16 line data shown in FIG. 7) is written in the area from the 0th pixel to the 15th pixel and from the 0th line to the 15th line of the frame memory 110. Next, block B01 is processed.
<Process of block B01: FIG. 6 (b)>
For the block B01, image data of 18 pixels × 18 lines from the 16th pixel to the 33rd pixel and from the 0th line to the 17th line is read into the storage unit 102. At this time, the data of the portion (3) at the right end of the storage unit 102 is moved to the left end (the 0th column and the 1st column in FIG. 4). The data of the portion (1), where no data exists, is supplemented by mirror processing. The data of the portion (4) (corresponding to the g-th and h-th rows in FIG. 4), which is necessary for processing the lower adjacent block B11 but is overwritten by the conversion data of the block B01, is copied to the line memory 104 in advance. The conversion data for the block B01 is written in the area from the 16th pixel to the 31st pixel and from the 0th line to the 15th line of the frame memory 110. Next, the process proceeds to block B10.
<Process of block B10: FIG. 6 (c)>
For the block B10, image data from the 0th pixel to the 17th pixel and from the 16th line to the 33rd line is read into the storage unit 102. The data of the portion (1), where no data exists, is supplemented by mirror processing. Since the portion (2) has been overwritten with the conversion data of the upper adjacent block B00, the data saved in the line memory 104 is written there. The data of the portion (6) is copied to the right end of the storage unit 102 so that it can be used in the processing of the block B11 on the right. The data of the portion (5), which is overwritten with the conversion data of the block B10, is copied to the line memory 104. The converted data of the block B10 is written in the area from the 0th pixel to the 15th pixel and from the 16th line to the 31st line of the frame memory 110. Next, the process proceeds to block B11.
<Process of block B11: FIG. 6 (d)>
Image data from the 16th pixel to the 33rd pixel in the frame memory 110 and from the 16th line to the 33rd line are written in the storage unit 102. At this time, the data of the part (6) at the right end of the storage unit 102 is moved to the left end. The data saved in the line memory 104 is written in the portion (4). The data of the portion (7) is copied to the line memory 104. The conversion data for this block B11 is written in the area from the 16th pixel to the 31st pixel and from the 16th line to the 31st line of the frame memory 110.
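The read and write ranges quoted for the four blocks follow a regular pattern, which the sketch below reproduces. It is an illustration only: the function name `block_windows` and the parameters `tile` and `extra` are hypothetical, and the sketch covers only the pixel and line ranges, not the mirror processing or the copies to the line memory and to the right end of the storage unit.

```python
def block_windows(row, col, tile=16, extra=2):
    """Inclusive pixel/line ranges read and written for block B<row><col>.

    tile  : converted pixels (lines) written back per block
    extra : additional pixels (lines) read beyond the tile
    Both defaults are taken from the 32 x 32 example in the text.
    """
    x0, y0 = col * tile, row * tile
    read = {"pixels": (x0, x0 + tile + extra - 1),
            "lines": (y0, y0 + tile + extra - 1)}
    write = {"pixels": (x0, x0 + tile - 1),
             "lines": (y0, y0 + tile - 1)}
    return read, write


for row in range(2):
    for col in range(2):
        read, write = block_windows(row, col)
        print(f"B{row}{col}", "read", read, "write", write)
```

For B01, for example, this gives a read range of pixels 16 to 33 and lines 0 to 17. Pixels and lines beyond the 31st do not exist in a 32 × 32 image, which is where the mirror-processed portion (1) comes in.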
[0040]
FIG. 8 is a timing chart for performing wavelet transform up to level 2 on each block described above. The period from time t0 to time t1 is a period during which data is read from the frame memory 110 into the storage unit 102 and level 1 horizontal processing is performed; the period from time t1 to time t4 is a period during which level 1 vertical processing through level 2 vertical processing are performed internally; and the period from time t4 to time t5 is a period during which the converted data is written to the frame memory 110. The numbers of data read and write cycles for the frame memory 110 are 400 cycles (= 20 × 20) and 256 cycles (= 16 × 16), respectively. Although the internal operation, which involves no access to the frame memory 110, can be much faster than frame memory access, it is assumed here that the internal operation runs at the same speed as frame memory access. Under this assumption, the internal period from time t1 to time t4 amounts to 800 cycles. Therefore, the total number of cycles from time t0 to time t5 is 1456 cycles.
[0041]
The same processing is repeated for the four blocks B00, B01, B10, and B11, and level 2 conversion data is obtained. When obtaining level 4 conversion data, the SS coefficients (2SS) included in the four blocks B00 to B11 are collected and processed as one block. Since this process has 9 pixels and 9 lines (counting from 0), the number of cycles is a quarter of that of the process of block B00, that is, 364 cycles. Therefore, the total time required for wavelet transform processing up to level 4 is
(400 + 256 + 800) * (4 + 1/4) = 6188
cycles. However, this value assumes that the internal operation speed is the same as that of frame memory access. In practice, it is easy to make the internal operation several times faster than frame memory access, so further shortening is possible. For example, assuming the internal operation is twice as fast as frame memory access, the total time decreases to
(400 + 256 + 800/2) * (4 + 1/4) = 4488
cycles. Assuming the internal operation is four times as fast as frame memory access, the total time decreases to
(400 + 256 + 800/4) * (4 + 1/4) = 3638
cycles. Thus, according to the present invention, when the internal operation is made faster than frame memory access, the total time is clearly shorter than the 5440 cycles required when similar processing is performed in the prior art.
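The three totals can be reproduced with a few lines of arithmetic. The sketch below simply restates the expressions from the text; the function name `total_cycles` and the parameter `speedup` (the ratio of internal operation speed to frame memory access speed) are hypothetical names introduced for illustration.

```python
from fractions import Fraction


def total_cycles(speedup):
    """Cycles for wavelet transform up to level 4 in the first embodiment."""
    read = 20 * 20      # frame memory -> storage unit, per block
    write = 16 * 16     # storage unit -> frame memory, per block
    internal = 800      # internal period t1..t4 at frame-memory speed
    per_block = read + write + Fraction(internal, speedup)
    # Four blocks up to level 2, plus one quarter-size block for levels 3 and 4.
    return per_block * (4 + Fraction(1, 4))


for speedup in (1, 2, 4):
    print(speedup, int(total_cycles(speedup)))
```

Exact fractions are used so that the 1/4 factor does not introduce rounding. At equal speed the result is above the 5440 cycles of the prior art, and it falls below that figure once the internal operation is at least twice as fast.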
[0042]
When the wavelet transform of the whole (or a part) of one frame is completed as described above, the frequency band signal data on the frame memory 110 is read out under the control of the main control unit 100 and written in the storage unit 102 via the row direction control unit 106 and the column direction control unit 108. Then, the encoding / decoding unit 109 performs an encoding process on the frequency band signal data in the storage unit 102 and outputs the encoded data code. That is, at this time, the storage unit 102 is used as a buffer area for the encoding process, and data input / output between the encoding / decoding unit 109 and the storage unit 102 is performed via the main control unit 100, the row direction control unit 106, and the column direction control unit 108.
[0043]
The encoding will be described in more detail. For each frequency band signal, for example 4DS, 4SD, 4DD, and so on, processing is performed in units of bit planes (one two-dimensional plane of bits at the same bit depth), in order from the MSB (most significant bit) toward the lower bits. The encoding process is performed in units of 2 (x direction) × 8 (y direction) pixels of the bit plane (when data of 2 × 8 size exists; if the data is smaller, in units of that size). What is actually processed is each unit of the above-mentioned size, but since the periphery is also referred to, the data of an area including the periphery, for example 4 × 10 pixel data, is read from the frame memory 110 into the storage unit 102. Further, when a frequency band signal of the same type exists one level above, it is also referred to and is read from the frame memory 110 in the same manner. When the processing of the most significant bit plane is completed, the next lower bit plane is processed in the same manner. By repeating this, the processing of one type of frequency band signal at one level is completed. This is performed for all types of frequency band signals at all levels, and the encoding ends.
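The traversal order described above (bit planes from the MSB downward, each plane cut into 2 × 8 units) can be sketched as follows. This is an illustration only: the generator name `bitplane_units` is hypothetical, the coefficients are assumed to be non-negative magnitudes, the row-major order of the units within a plane is an assumption, and neither the neighborhood reference nor the entropy coder itself is modeled.

```python
def bitplane_units(coeffs, bit_depth, unit_w=2, unit_h=8):
    """Yield (bit, x, y, block) from the MSB plane down to the LSB plane.

    coeffs is a list of rows of non-negative integers (an assumption);
    block holds the bits of one unit, clipped at the band boundary.
    """
    height, width = len(coeffs), len(coeffs[0])
    for bit in range(bit_depth - 1, -1, -1):        # MSB first
        for y in range(0, height, unit_h):
            for x in range(0, width, unit_w):
                block = [[(coeffs[r][c] >> bit) & 1
                          for c in range(x, min(x + unit_w, width))]
                         for r in range(y, min(y + unit_h, height))]
                yield bit, x, y, block


# A band of 8 lines x 2 pixels in which every coefficient is 5 (binary 101).
band = [[5, 5] for _ in range(8)]
units = list(bitplane_units(band, bit_depth=3))
print([(bit, x, y) for bit, x, y, _ in units])
```

Clipping with `min` corresponds to the case noted in the text where less than 2 × 8 of data exists and a smaller unit is used.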
[0044]
The above is the description in the case of encoding, but decoding is performed by the reverse procedure of encoding. That is, encoded data code input from the outside is decoded by the encoding / decoding unit 109, and the frequency band signal data is restored on the storage unit 102. At this time, the storage unit 102 is used as a buffer area for the decoding process. The restored frequency band signal data is written to the frame memory 110.
[0045]
In more detail, a certain type of frequency band signal, for example the 4DD coefficients, is decoded from the encoded data code in bit units starting from the MSB, and a bit plane is reproduced. Decoding is also performed in units of 2 (x direction) × 8 (y direction) pixels of the bit plane (when data of 2 × 8 size exists; if the data is smaller, in units of that size). If a frequency band signal of the same type exists at a higher level, it is also referred to. When the processing of the bit plane is completed, the bit plane one bit lower is processed. Similar processing is performed for all types of frequency band signals at all levels.
[0046]
In this way, when the frequency band signal data of the whole (or a part) of the frame has been restored on the frame memory 110, the frequency band signal data for one tile is read into the storage unit 102, and the inverse wavelet transform process is performed on it using the wavelet transform unit 101. Inverse wavelet transform processing starts from level 4, and at each level vertical processing and horizontal processing are performed in this order. First, the 3SS coefficients are reproduced from the SS coefficients and the 4SD, 4DS, and 4DD coefficients, and are written over the SS coefficients and the 4SD, 4DS, and 4DD coefficients. The 2SS coefficients are then reproduced from the reproduced 3SS coefficients and the 3SD, 3DS, and 3DD coefficients, and are written over the 3SS, 3SD, 3DS, and 3DD coefficients. The reverse procedure of the wavelet transform is repeated in the same way until the image data is finally restored, written to the frame memory 110, and overwritten on the frequency band signal data of the tile.
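The level ordering described above (forward: horizontal then vertical, recursing on the SS band; inverse: from the highest level down, vertical then horizontal, overwriting in place) can be illustrated with a small round-trip sketch. The filter used here is an assumption: it is the simple two-tap reversible S-transform, chosen only because it is short and exactly invertible, and it is not the filter of this patent (which needs three S coefficients per D coefficient and mirror processing). The function names are hypothetical, and a square image whose side is divisible by 2 to the power of `levels` is assumed.

```python
def fwd_1d(v):
    """Split an even-length list into smooth (S) and detail (D) halves."""
    pairs = list(zip(v[0::2], v[1::2]))
    return [(a + b) >> 1 for a, b in pairs] + [a - b for a, b in pairs]


def inv_1d(v):
    """Exact inverse of fwd_1d."""
    half = len(v) // 2
    out = []
    for s, d in zip(v[:half], v[half:]):
        a = s + ((d + 1) >> 1)
        out += [a, a - d]
    return out


def apply_cols(img, n, step):
    for c in range(n):
        col = step([img[r][c] for r in range(n)])
        for r in range(n):
            img[r][c] = col[r]


def apply_rows(img, n, step):
    for r in range(n):
        img[r][:n] = step(img[r][:n])


def forward(image, levels):
    img = [row[:] for row in image]
    n = len(img)
    for _ in range(levels):
        apply_rows(img, n, fwd_1d)   # horizontal processing
        apply_cols(img, n, fwd_1d)   # vertical processing
        n //= 2                      # next level works on the SS quadrant only
    return img


def inverse(coeffs, levels):
    img = [row[:] for row in coeffs]
    n = len(img) >> (levels - 1)     # start from the highest level
    for _ in range(levels):
        apply_cols(img, n, inv_1d)   # vertical processing first
        apply_rows(img, n, inv_1d)   # then horizontal processing
        n *= 2
    return img


image = [[(7 * r + 13 * c) % 256 for c in range(8)] for r in range(8)]
restored = inverse(forward(image, 2), 2)
print(restored == image)
```

Each inverse level overwrites the SS quadrant and its three companion bands with the reconstructed lower-level SS band, which is the in-place overwrite the paragraph describes.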
[0047]
Note that, when the encoding / decoding device, excluding the external frame memory 110, is made into one chip, it is important to suppress the increase in the chip area occupied by the storage elements. According to the present embodiment, since the storage unit 102 is shared as a buffer area for the wavelet transform unit 101 and the encoding / decoding unit 109, the chip area occupied by the storage elements can be reduced compared with a configuration in which separate storage units are prepared for each, and high-speed processing is possible as described above.
[0048]
FIG. 9 is a block diagram showing a second embodiment of the present invention. The encoding / decoding apparatus according to the present embodiment includes a main control unit 200, a storage unit 202, a line memory 204, a row direction filter unit 206 for horizontal processing of wavelet transform or inverse wavelet transform, a column direction filter unit 207 for vertical processing, a row direction address decoding / data selection unit 208, a column direction address decoding / data selection unit 209, and an encoding / decoding unit 211. Reference numeral 210 denotes an external frame memory connected to the present encoding / decoding device. The row direction address decoding / data selection unit 208 controls access to the storage unit 202 in the row direction, and the column direction address decoding / data selection unit 209 controls access to the storage unit 202 in the column direction. The main control unit 200 accesses the storage unit 202 via the row direction address decoding / data selection unit 208 and the column direction address decoding / data selection unit 209, and controls writing and reading of data to and from the storage unit 202, data input / output to and from the filter units 206 and 207, writing and reading of data to and from the line memory 204, data input / output to and from the encoding / decoding unit 211, and writing and reading of data to and from the frame memory 210. The line memory 204 is used for the same purpose as the line memory 104 in the first embodiment. FIG. 10 shows one storage cell in the storage unit 202. Each storage cell includes a shift register SR having a bit depth of n bits and a multiplexer MUX, which are independent of each other.
[0049]
In the encoding / decoding apparatus of the present embodiment, as described above, the filter unit for wavelet transform or inverse wavelet transform is separated into a row direction filter unit 206 for horizontal processing and a column direction filter unit 207 for vertical processing. During horizontal processing, the data necessary for the filter operation is input to the row direction filter unit 206 via the row direction address decoding / data selection unit 208, and the calculation result is written to the storage unit 202 via the row direction address decoding / data selection unit 208. During vertical processing, the data necessary for the filter operation is input to the column direction filter unit 207 via the column direction address decoding / data selection unit 209, and the calculation result is written to the storage unit 202 via the column direction address decoding / data selection unit 209. Except for the configuration related to the filter units, the overall operation of the encoding / decoding apparatus of this embodiment is basically the same as that of the first embodiment, and the line memory 204 is used in the same way.
[0050]
However, in the storage unit 202, each storage cell is completely independent as shown in FIG. 10, and any storage cell can be written or read directly by having the row direction address decoding / data selection unit 208 and the column direction address decoding / data selection unit 209 decode the address issued by the main control unit 200. Therefore, immediately after the data transfer from the frame memory 210 to the storage unit 202 ends, all row direction processing (horizontal processing) can be completed in a few cycles, and immediately after the horizontal processing ends, the column direction processing (vertical processing) can likewise be completed in a few cycles. That is, the internal processing can be made much faster than in the first embodiment.
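The division into a row direction pass and a column direction pass over one shared storage array can be illustrated with a short sketch. This is not the circuit of FIG. 9: the filter taps are not given in this section, so a Haar-style averaging / differencing pair is assumed purely as a stand-in, and the function names (`filter_1d`, `horizontal_pass`, `vertical_pass`, `level1_transform`) and the placement of S coefficients before D coefficients are illustrative assumptions.

```python
# Sketch only: a Haar-style pair stands in for the patent's filters,
# whose taps are not specified in this section (assumption).

def filter_1d(samples):
    """Split an even-length sequence into S (smooth) then D (detail) values."""
    pairs = [(samples[i], samples[i + 1]) for i in range(0, len(samples), 2)]
    smooth = [(a + b) // 2 for a, b in pairs]   # low-pass stand-in
    detail = [a - b for a, b in pairs]          # high-pass stand-in
    return smooth + detail

def horizontal_pass(store):
    """Row direction processing: read each row, filter it, write it back."""
    for r in range(len(store)):
        store[r] = filter_1d(store[r])

def vertical_pass(store):
    """Column direction processing on the same storage array."""
    for c in range(len(store[0])):
        column = filter_1d([row[c] for row in store])
        for r, value in enumerate(column):
            store[r][c] = value

def level1_transform(store):
    """One level: horizontal processing followed by vertical processing."""
    horizontal_pass(store)
    vertical_pass(store)
    return store

block = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
result = level1_transform(block)
```

In this sketch the top-left quadrant of `result` holds the SS coefficients, and a further level would repeat both passes on that quadrant only. Both passes update the same array in place, which mirrors the shared use of the storage unit 202, although the hardware described above performs the accesses through the address decoding / data selection units rather than through loops.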
[0051]
FIG. 11 is a timing chart of the wavelet transform processing of the encoding / decoding apparatus according to this embodiment, with the same tiling as in the first embodiment. In FIG. 11, the period from time t0 to time t1 is the period of data reading from the frame memory 210 and level 1 horizontal processing; the period from time t1 to time t4 is the period of internal processing, namely level 1 vertical processing and level 2 horizontal and vertical processing; and the period from time t4 to time t5 is the period of data writing to the frame memory 210. Data reading from and data writing to the frame memory 210 take 400 cycles and 256 cycles, respectively. As described above, the internal processing period from time t1 to time t4 can be almost ignored if the internal operation is faster than access to the frame memory 210, but here the number of cycles is calculated assuming that it takes 100 cycles. The total from time t0 to time t5 is then 756 cycles. This processing is repeated for the four blocks B00, B01, B10, and B11 (see FIG. 5) to obtain level 2 transform data. To obtain level 4 transform data, the SS coefficients (2SS) contained in the four blocks B00 to B11 are collected and processed as one block. Since this processing covers 9 pixels by 9 lines (counting from 0), its number of cycles is one quarter of that for block B00, that is, 189 cycles. Therefore, the total time required for wavelet transform processing up to level 4 is
(400 + 256 + 100) * (4 + 1/4) = 3213 cycles.
If the internal processing is assumed to take 50 cycles, the total time decreases to
(400 + 256 + 50) * (4 + 1/4) ≈ 3000 cycles.
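The cycle arithmetic above can be checked with a few lines of code. The sketch below only restates the formula from the text; the function name `total_cycles` and its parameter names are illustrative, and exact fractions are used so that the quarter-size final pass is not rounded.

```python
from fractions import Fraction

def total_cycles(read, write, internal, blocks=4, extra=Fraction(1, 4)):
    """Cycles for `blocks` full blocks plus one pass costing `extra` of a block."""
    per_block = read + write + internal
    return per_block * (blocks + extra)

with_100 = total_cycles(400, 256, 100)  # internal processing assumed 100 cycles
with_50 = total_cycles(400, 256, 50)    # internal processing assumed 50 cycles

print(with_100)        # 3213
print(float(with_50))  # 3000.5
```

The first case gives exactly 3213 cycles. The second evaluates to 3000.5, which rounds to the figure of 3000 cycles quoted in the text.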
[0052]
As described above, this embodiment clearly reduces the total time of the wavelet transform processing significantly compared with the 5440 cycles of the prior art, and it also allows even faster processing than the first embodiment.
[0053]
The encoding and decoding processes performed by the encoding / decoding unit 211 are the same as in the first embodiment, so their description is omitted. Because the storage unit 202 serves both as the buffer area for wavelet transform processing and as the buffer area for encoding / decoding processing, high-speed processing can be realized without increasing the chip area occupied by storage, compared with a configuration that provides a separate storage unit for each purpose.
[0054]
In the first and second embodiments, as described above, the original data input from the outside is written directly into the line memory (104, 204). According to the third embodiment of the present invention, by contrast, in an encoding / decoding apparatus of the same configuration, intermediate data of the processing (in the present invention, the output of the low-pass filter) is written into the line memory, making use of the characteristics of the wavelet transform filter. As a result, if the other conditions are the same as in the above embodiments, then as shown in FIG. 12 the size X of the line memory (104, 204) in the row direction remains equal to the row direction size of the frame memory (110, 210), as in the above embodiments, while the size Y in the column direction can be halved from 2 in the above embodiments to 1. This reduction in line memory capacity becomes more pronounced as the row direction size of the frame memory increases and as the number of taps of the high-pass filter increases.
[0055]
According to the fourth embodiment of the present invention, in the same configuration as the first, second, or third embodiment, the size of the storage unit (102, 202) can be reduced by 2 rows and 2 columns, as shown in FIG. 13. This is because, in this embodiment, intermediate data of the processing (in the present invention, the output of the low-pass filter) is written into the overlap area of the storage unit (the shaded area in FIG. 13), making use of the characteristics of the filter. (In each of the above embodiments, the original data input from the outside is written directly into the overlap area.) The reduction in storage unit capacity achieved by this embodiment becomes more pronounced as the number of taps of the high-pass filter increases.
[0056]
In each of the embodiments described above, the size of the image that can be processed at one time is limited by the size of the storage unit (102, 202). Therefore, when the required number of wavelet transform levels increases, this must be handled by recursive processing in which the results up to a certain level are written once to the frame memory (110, 210) and the SS coefficients are then read back and processed.
[0057]
According to the fifth embodiment of the present invention, in the same configuration as each of the above embodiments, when the maximum required level of the wavelet transform is known in advance, the storage unit (102, 202) is given the size necessary to execute the wavelet transform processing of all levels up to that maximum level internally and continuously. For example, when a wavelet transform up to level 6 is required, the size of the storage unit (102, 202) is set to (64 rows × 64 columns + overlap region). With this size, the wavelet transforms of all levels from level 1 to level 6 can be performed internally on the data once read, without the recursive processing described above, and the result then written to the frame memory. This reduces the number of reads from and writes to the frame memory, which is slower than the internal processing, so the processing speed can be increased further.
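The 64 rows × 64 columns quoted for level 6 is consistent with a side length of 2 raised to the maximum level, since each level halves the SS region in both directions. The sketch below assumes that rule, which the text does not state explicitly. The helper names are hypothetical, and the size of the overlap region is left as a parameter because it depends on the filter taps, which are not given here.

```python
def storage_side(max_level):
    """Rows (or columns) such that max_level halvings leave one SS coefficient.

    Assumption: side = 2 ** max_level, inferred from the level 6 example.
    """
    if max_level < 1:
        raise ValueError("max_level must be at least 1")
    return 2 ** max_level

def storage_cells(max_level, overlap_cells=0):
    """Total storage cells: the square core plus an overlap region of given size."""
    side = storage_side(max_level)
    return side * side + overlap_cells

def ss_sides(max_level):
    """Side length of the SS region after each level, starting from the core."""
    side = storage_side(max_level)
    return [side >> level for level in range(1, max_level + 1)]

print(storage_side(6))  # 64
print(ss_sides(6))      # [32, 16, 8, 4, 2, 1]
```

Under this assumption a level 6 core of 64 × 64 cells shrinks to a single SS coefficient after the sixth level, so no intermediate result has to leave the storage unit before the final write to the frame memory.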
[0058]
[Effects of the invention]
As is clear from the above explanation, the encoding / decoding apparatus of the present invention has a configuration that, unlike the prior art, does not require pipeline processing for the wavelet transform or inverse wavelet transform, and it can easily cope with changes in the number of pixels to be processed and in the number of wavelet transform levels. Since the same storage unit is shared as the buffer area for the wavelet transform or inverse wavelet transform and as the buffer area for encoding or decoding, high-speed processing is possible without increasing the chip area for the storage unit. In the encoding / decoding apparatus of claim 1, the storage unit is composed of independent storage cells that can be accessed arbitrarily, so still faster processing is possible. The encoding / decoding apparatus of claim 2 can further increase the speed by performing the data input from the outside and the horizontal processing of the wavelet transform in parallel. The encoding / decoding apparatus of claim 3 or 4 can further reduce the chip area for the line memory and / or the storage unit. The encoding / decoding apparatus of claim 5 can execute all wavelet transforms up to the required maximum level internally and continuously, so even faster processing is possible.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a first embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration of a memory cell of the memory unit according to the first embodiment.
FIG. 3 is an explanatory diagram of a data flow in a row direction of a storage unit.
FIG. 4 is a diagram illustrating the mapping of the storage unit at the end of level 1 horizontal processing.
FIG. 5 is a diagram exemplifying how a storage unit and a line memory are allocated to a frame memory.
FIG. 6 is a diagram illustrating a data flow regarding a storage unit and a line memory at the time of each block processing.
FIG. 7 is a diagram illustrating the mapping of the storage unit at the time when the wavelet transform up to level 2 is completed.
FIG. 8 is a timing chart when the wavelet transform up to level 2 is performed on one block in the first embodiment.
FIG. 9 is a block diagram showing a second embodiment of the present invention.
FIG. 10 is a diagram showing one storage cell of a storage unit in the second embodiment.
FIG. 11 is a timing chart when wavelet transform up to level 2 is performed on one block in the second embodiment.
FIG. 12 is a diagram for explaining a third embodiment of the present invention.
FIG. 13 is a diagram for explaining a fourth embodiment of the present invention.
FIG. 14 is a block diagram showing a conventional example.
FIG. 15 is a block diagram of an encoding / decoding unit in a conventional example.
FIG. 16 is an explanatory diagram of a calculation method of horizontal processing and vertical processing of wavelet transform.
FIG. 17 is a diagram illustrating a memory map of image data.
FIG. 18 is a diagram showing a memory map of level 1 S coefficients and D coefficients.
FIG. 19 is a diagram showing a memory map of level 1 SS coefficients, SD coefficients, DS coefficients, and DD coefficients.
FIG. 20 is a diagram showing a memory map of level 2 S coefficients and D coefficients.
FIG. 21 is a diagram showing a memory map of level 2 SS coefficients, SD coefficients, DS coefficients, and DD coefficients.
FIG. 22 is a timing chart of a conventional example.
[Explanation of symbols]
100 Main control unit
101 Wavelet transform unit
102 Storage unit
104 Line memory
106 Row direction control unit
108 Column direction control unit
109 Encoding / decoding unit
110 External frame memory
200 Main control unit
202 Storage unit
204 Line memory
206 Row direction filter unit
207 Column direction filter unit
208 Row direction address decoding / data selection unit
209 Column direction address decoding / data selection unit
210 External frame memory
211 Encoding / decoding unit

Claims

1. An encoding / decoding apparatus comprising: an encoding / decoding unit that encodes frequency band signals obtained by a wavelet transform or decodes encoded data; a storage unit composed of independent storage elements; a line memory for temporarily storing a part of the data to be written to the storage unit; a row direction address decoding / data selection unit that controls access to the storage unit in the row direction; a column direction address decoding / data selection unit that controls access to the storage unit in the column direction; a row direction filter unit that performs horizontal processing of the wavelet transform or inverse wavelet transform on data input via the row direction address decoding / data selection unit; a column direction filter unit that performs vertical processing of the wavelet transform or inverse wavelet transform on data input via the column direction address decoding / data selection unit; and a main control unit that writes and reads data to and from the storage unit via the row direction address decoding / data selection unit and the column direction address decoding / data selection unit, controls data input / output with respect to the row direction filter unit and the column direction filter unit, and controls data input / output with respect to the encoding / decoding unit, data writing / reading with respect to the line memory, and data writing / reading with respect to an external memory; wherein the storage unit is shared as a buffer area for wavelet transform processing or inverse wavelet transform processing and as a buffer area for encoding processing or decoding processing.

2. The encoding / decoding apparatus according to claim 1, wherein horizontal processing of wavelet transform is executed in parallel with data input from the external memory.

3. The encoding / decoding apparatus according to claim 1, wherein the data written in the line memory is intermediate data of processing.

4. The encoding / decoding apparatus according to claim 1 or 2, wherein intermediate data of processing is written in an overlap area of the storage unit.

5. The encoding / decoding apparatus according to claim 1, wherein the storage unit has the storage capacity necessary to continuously execute wavelet transform processing of all levels up to a required maximum level internally.