JP4766451B2

JP4766451B2 - Encoding apparatus, image processing apparatus, encoding method, and encoding program

Info

Publication number: JP4766451B2
Application number: JP2005365759A
Authority: JP
Inventors: 雅則関野
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2005-12-20
Filing date: 2005-12-20
Publication date: 2011-09-07
Anticipated expiration: 2025-12-20
Also published as: JP2007174008A

Description

The present invention relates to an encoding apparatus, an image processing apparatus, an encoding method, and an encoding program for encoding data.

As an image processing apparatus that encodes image data efficiently, one that separates image data into character image regions and other image regions before encoding is known. For this type of image processing apparatus, it is known to reliably control the total code amount to a constant amount (see Patent Document 1). It is also known to increase the compression ratio by dividing image data into image regions such as printed characters, handwritten characters, photographs, and patterns, and applying image data compression suited to the image type of each region (see Patent Document 2).

JP 7-123273 A
JP 7-099581 A

However, in the above conventional examples, the process for extracting character image regions must be applied uniformly to the entire image, so there is a problem that faster encoding and improved image quality cannot both be achieved.

Accordingly, an object of the present invention is to provide an encoding apparatus, an image processing apparatus, an encoding method, and an encoding program that can encode input data at high speed without reducing encoding accuracy.

To achieve the above object, a first feature of the present invention is an encoding apparatus comprising: first reduction means for reducing input data at a first reduction ratio; first classification means for classifying the input data reduced by the first reduction means into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a first value; encoding means for determining, based on a feature amount of a connected component of black pixels included in the character data classified by the first classification means, whether the input data reduced by the first reduction means has been correctly classified, and for encoding character data determined to be correctly classified with an algorithm corresponding to character data; second reduction means for reducing data other than character data, which the encoding means has determined not to be correctly classified, at a second reduction ratio lower than the first reduction ratio; and second classification means for classifying the data other than character data reduced by the second reduction means into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a second value different from the first value, wherein the encoding means further encodes the character data classified by the second classification means. Accordingly, when the input data reduced by the first reduction means has been correctly classified, it can be encoded without further classification, so the input data can be encoded at high speed; when it has not been correctly classified, it is encoded after further classification, so a decrease in encoding accuracy can be prevented.

Preferably, the first classification means separates a predetermined amount of data with a smaller processing amount than the second classification means.

A second feature of the present invention is an image processing apparatus comprising: first reduction means for reducing an input image at a first reduction ratio; first classification means for classifying the input data reduced by the first reduction means into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a first value; encoding means for determining, based on a feature amount of a connected component of black pixels included in the character data classified by the first classification means, whether the input data reduced by the first reduction means has been correctly classified, and for encoding character data determined to be correctly classified with an algorithm corresponding to character data; second reduction means for reducing data other than character data, which the encoding means has determined not to be correctly classified, at a second reduction ratio lower than the first reduction ratio; and second classification means for classifying the data other than character data reduced by the second reduction means into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a second value different from the first value, wherein the encoding means further encodes the character data classified by the second classification means.

Preferably, the first classification means separates the input image into any two or more kinds of data, including combinations of character data, halftone data, data not suitable for encoding as a JBIG2 generic region, and other data.

A third feature of the present invention is an encoding method comprising: reducing input data at a first reduction ratio; classifying the input data reduced at the first reduction ratio into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a first value; determining, based on a feature amount of a connected component of black pixels included in the classified character data, whether the input data reduced at the first reduction ratio has been correctly classified, and encoding character data determined to be correctly classified with an algorithm corresponding to character data; reducing data other than character data, which has been determined not to be correctly classified, at a second reduction ratio lower than the first reduction ratio; classifying the data other than character data reduced at the second reduction ratio into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a second value different from the first value; and encoding the classified character data.

A fourth feature of the present invention is an encoding program that causes a computer to execute the steps of: reducing input data at a first reduction ratio; classifying the input data reduced at the first reduction ratio into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a first value; determining, based on a feature amount of a connected component of black pixels included in the classified character data, whether the input data reduced at the first reduction ratio has been correctly classified, and encoding character data determined to be correctly classified with an algorithm corresponding to character data; reducing data other than character data, which has been determined not to be correctly classified, at a second reduction ratio lower than the first reduction ratio; classifying the data other than character data reduced at the second reduction ratio into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a second value different from the first value; and encoding the classified character data.

According to the present invention, input data can be encoded at high speed without reducing encoding accuracy.

Next, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 shows an outline of an image processing apparatus 1 according to an embodiment of the present invention. The image processing apparatus 1 includes a user interface device (UI device) 10 including a display device and a keyboard, a storage device 12 such as an HDD or CD device, a printing device 14, a communication device 16, a control device 2, and the like. The control device 2 includes a CPU 20, a memory 22, and the like, and controls each unit of the image processing apparatus 1.
That is, the image processing apparatus 1 functions as a computer and, by executing an encoding program 3 described later, encodes image data (an input image) received via a storage medium 120 or the communication device 16 and stores it in, for example, the storage device 12.

FIGS. 2 to 5 show the configuration of the encoding program 3 executed by the image processing apparatus 1.
As shown in FIG. 2, the encoding program 3 includes a region rough separation unit 30, a text encoding unit 32, a general encoding unit 34, and a region detail separation unit 36.

FIG. 3 is a block diagram showing the configuration of the region rough separation unit 30.
As shown in FIG. 3, the region rough separation unit 30 includes a first reduction processing unit 300, a connected component extraction unit 302, and a first classification unit 304. It roughly separates the input image into text regions and other regions (rough separation), for example by the method described in "Dave A.D. Tompkins, Faouzi Kossentini. A Fast Segmentation Algorithm for Bi-Level Image Compression using JBIG2. Proceedings of the 1999 IEEE International Conference on Image Processing (ICIP), 1999" (Reference A).

The first reduction processing unit 300 reduces the received input image at a predetermined reduction ratio and outputs it to the connected component extraction unit 302.
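The description does not say how the reduction itself is computed. As a minimal sketch, assuming a bi-level image stored as rows of 0/1 values and an integer reduction factor, a block-wise OR reduction could look like the following. The function name `reduce_binary`, the `factor` parameter, and the OR rule are illustrative assumptions, not taken from the patent or from Reference A.

```python
from typing import List

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel


def reduce_binary(image: Bitmap, factor: int) -> Bitmap:
    """Shrink a bi-level image by an integer factor using block-wise OR.

    Assumption: an output pixel is black if any pixel in the corresponding
    factor x factor block is black. The patent does not fix this rule.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    height = len(image)
    width = len(image[0]) if height else 0
    out_h = (height + factor - 1) // factor  # round up so edge pixels are kept
    out_w = (width + factor - 1) // factor
    reduced = [[0] * out_w for _ in range(out_h)]
    for y in range(height):
        for x in range(width):
            if image[y][x]:
                reduced[y // factor][x // factor] = 1
    return reduced
```

Under this rule a larger factor merges nearby black pixels into fewer, larger blobs, which is what makes the later connected-component pass cheap.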

The connected component extraction unit 302 extracts each 8-connected component formed by a pixel of interest and its 8-neighboring pixels in the image input from the first reduction processing unit 300, and outputs the components to the first classification unit 304.
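For illustration, 8-connected component extraction can be sketched with a breadth-first flood fill over a 0/1 bitmap. The function name `extract_8_connected` and the representation of a component as a list of (row, column) pairs are assumptions made for this sketch; the patent does not prescribe a particular labeling algorithm.

```python
from collections import deque
from typing import List, Tuple

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel
Pixel = Tuple[int, int]   # (row, column)


def extract_8_connected(image: Bitmap) -> List[List[Pixel]]:
    """Return every 8-connected component of black pixels, in scan order."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components: List[List[Pixel]] = []
    for r in range(height):
        for c in range(width):
            if not image[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            component: List[Pixel] = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                # Visit all eight neighbors, including the diagonals.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(component)
    return components
```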

The first classification unit 304 compares the size of each 8-connected component input from the connected component extraction unit 302 with a predetermined threshold (first classification threshold). If the size of the 8-connected component is smaller than the threshold, it outputs the pixel of interest to the text encoding unit 32 as a text region; if the size is equal to or greater than the threshold, it outputs the pixel of interest to the general encoding unit 34 as another region.
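The threshold comparison can be sketched as below. The text does not define how the "size" of a component is measured, so this example assumes the longer side of the bounding box; `component_size`, `classify_components`, and the dictionary keys are hypothetical names.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]  # (row, column)


def component_size(component: List[Pixel]) -> int:
    """Assumed size measure: longer side of the bounding box, in pixels."""
    rows = [r for r, _ in component]
    cols = [c for _, c in component]
    return max(max(rows) - min(rows), max(cols) - min(cols)) + 1


def classify_components(
    components: List[List[Pixel]], threshold: int
) -> Dict[str, List[List[Pixel]]]:
    """Components smaller than the threshold are text; the rest are 'other'."""
    regions: Dict[str, List[List[Pixel]]] = {"text": [], "other": []}
    for comp in components:
        key = "text" if component_size(comp) < threshold else "other"
        regions[key].append(comp)
    return regions
```

A component whose size equals the threshold falls on the "other" side, matching the "equal to or greater than" wording above.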

For example, as in the reduced image example from Reference A shown in FIG. 6, the region rough separation unit 30 reduces the input image and then separates it into text regions and other regions for output.

FIG. 4 is a block diagram showing the configuration of the text encoding unit 32.
As shown in FIG. 4, the text encoding unit 32 includes a symbol extraction unit 320, a feature amount extraction unit 322, a non-text symbol removal unit 324, a dictionary creation unit 326, and an encoding processing unit 328.

The symbol extraction unit 320 receives text regions from the region rough separation unit 30, extracts connected components of black pixels, and outputs them to the feature amount extraction unit 322.
The symbol extraction unit 320 likewise extracts connected components of black pixels from text regions input from the region detail separation unit 36, described later, and outputs them to the feature amount extraction unit 322.
Hereinafter, a connected component of black pixels extracted by the symbol extraction unit 320 is referred to as a symbol.

The feature amount extraction unit 322 extracts (calculates) the feature amounts of each symbol input from the symbol extraction unit 320 and outputs them to the non-text symbol removal unit 324. The feature amounts extracted by the feature amount extraction unit 322 are, for example, the height and width of the symbol and the number of holes.
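A minimal sketch of these three features follows, assuming each symbol is given as a non-empty, tightly cropped 0/1 bitmap. Holes are counted here as white regions that are 4-connected among themselves and not connected to the outside; that definition, and the names `SymbolFeatures`, `count_holes`, and `extract_features`, are assumptions for illustration.

```python
from typing import List, NamedTuple

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel


class SymbolFeatures(NamedTuple):
    width: int
    height: int
    holes: int


def count_holes(symbol: Bitmap) -> int:
    """Count white regions fully enclosed by black pixels."""
    h, w = len(symbol), len(symbol[0])
    # Pad with a white border so all outside background forms one region.
    grid = [[0] * (w + 2)] + [[0] + list(row) + [0] for row in symbol] + [[0] * (w + 2)]
    seen = [[False] * (w + 2) for _ in range(h + 2)]
    regions = 0
    for r in range(h + 2):
        for c in range(w + 2):
            if grid[r][c] or seen[r][c]:
                continue
            regions += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                # Background uses 4-connectivity, the dual of 8-connected ink.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h + 2 and 0 <= nx < w + 2
                            and not grid[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return regions - 1  # subtract the single outside region


def extract_features(symbol: Bitmap) -> SymbolFeatures:
    return SymbolFeatures(len(symbol[0]), len(symbol), count_holes(symbol))
```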

The non-text (non-character) symbol removal unit 324 compares the feature amounts of each symbol input from the feature amount extraction unit 322 with predetermined thresholds, determines that a symbol whose feature amounts fall outside the thresholds belongs to a non-text symbol region, and outputs it to the region detail separation unit 36. When the feature amounts of a symbol input from the feature amount extraction unit 322 are within the predetermined thresholds, the non-text symbol removal unit 324 regards the symbol as correctly determined to be a text region and outputs it to the dictionary creation unit 326.
That is, when the region rough separation unit 30 has not separated the regions correctly, the non-text symbol removal unit 324 removes the non-text symbol regions from the text region.
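The routing decision made by the non-text symbol removal unit can be sketched as a simple filter. The limit values below are hypothetical placeholders; the patent only speaks of "predetermined thresholds", and `is_text_symbol` and `split_symbols` are illustrative names.

```python
from typing import Iterable, List, NamedTuple, Tuple


class SymbolFeatures(NamedTuple):
    width: int
    height: int
    holes: int


# Hypothetical limits chosen only for this example.
MAX_WIDTH = 64
MAX_HEIGHT = 64
MAX_HOLES = 4


def is_text_symbol(f: SymbolFeatures) -> bool:
    """True if every feature stays within its assumed limit."""
    return f.width <= MAX_WIDTH and f.height <= MAX_HEIGHT and f.holes <= MAX_HOLES


def split_symbols(
    features: Iterable[SymbolFeatures],
) -> Tuple[List[SymbolFeatures], List[SymbolFeatures]]:
    """Return (symbols kept as text, symbols sent on for detailed separation)."""
    accepted: List[SymbolFeatures] = []
    rejected: List[SymbolFeatures] = []
    for f in features:
        (accepted if is_text_symbol(f) else rejected).append(f)
    return accepted, rejected
```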

The dictionary creation unit 326 has functionality equivalent to JBIG2 Amd. 1, created by JBIG (Joint Bi-level Image Experts Group). It compares the symbols input from the non-text symbol removal unit 324 with one another and adds symbols that cannot be regarded as identical to the dictionary, thereby creating a symbol dictionary, which it outputs to the encoding processing unit 328.
The dictionary creation unit 326 also speeds up symbol comparison by comparing the feature amounts of the symbols with one another. Here, the dictionary creation unit 326 compares symbols using the feature amounts that have already been extracted (calculated).
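The idea of rejecting candidate matches by cheap features before any pixel comparison can be sketched as follows. This is not the matching rule of the JBIG2 standard or of the patent: the Hamming-distance criterion, the `max_diff` tolerance, and the feature triple (width, height, black pixel count) are assumptions for illustration, and the features are recomputed here only to keep the example self-contained, whereas the text reuses features that were already extracted.

```python
from typing import List, Tuple

Bitmap = Tuple[Tuple[int, ...], ...]  # immutable rows of 0/1 values


def features(symbol: Bitmap) -> Tuple[int, int, int]:
    """Cheap features: (width, height, black pixel count)."""
    return (len(symbol[0]), len(symbol), sum(map(sum, symbol)))


def same_symbol(a: Bitmap, b: Bitmap, max_diff: int) -> bool:
    fa, fb = features(a), features(b)
    if fa[:2] != fb[:2] or abs(fa[2] - fb[2]) > max_diff:
        return False  # rejected by features alone, no pixel comparison
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff <= max_diff


def build_dictionary(
    symbols: List[Bitmap], max_diff: int = 1
) -> Tuple[List[Bitmap], List[int]]:
    """Return (dictionary, per-symbol index into the dictionary)."""
    dictionary: List[Bitmap] = []
    indices: List[int] = []
    for sym in symbols:
        for i, entry in enumerate(dictionary):
            if same_symbol(sym, entry, max_diff):
                indices.append(i)
                break
        else:
            # No existing entry matched, so this symbol becomes a new entry.
            indices.append(len(dictionary))
            dictionary.append(sym)
    return dictionary, indices
```

The black pixel count is a safe prefilter because two bitmaps of equal size cannot differ in fewer pixels than the difference of their black pixel counts.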

The encoding processing unit 328 encodes the text region (text encoding) using the symbol dictionary input from the dictionary creation unit 326, for example according to the text region encoding procedure of JBIG2, and outputs the result to the storage device 12 or the like.

In this way, in the text encoding unit 32, the non-text symbol removal unit 324 removes from the text region symbols that are clearly not characters (for example, symbols whose size is larger than a threshold or whose number of holes exceeds a threshold). This reduces the size of the dictionary created by the dictionary creation unit 326, shortens symbol dictionary creation time, and improves the compression ratio of the input image.
The text encoding unit 32 also encodes text regions input from the region detail separation unit 36, which improves the encoding accuracy of the input image.

The general encoding unit 34 encodes the other regions input from the region rough separation unit 30, for example as JBIG2 generic regions (general encoding), and outputs the result to the storage device 12 or the like.
The general encoding unit 34 also encodes the other regions input from the region detail separation unit 36, described later, in the same manner as the other regions input from the region rough separation unit 30.

FIG. 5 is a block diagram showing the configuration of the region detail separation unit 36.
As shown in FIG. 5, the region detail separation unit 36 includes a second reduction processing unit 360, a connected component extraction unit 362, and a second classification unit 364. Using substantially the same method as the region rough separation unit 30 but with different settings (conditions), it separates in detail the regions that the region rough separation unit 30 separated incorrectly into text regions and other regions (detailed separation).

The second reduction processing unit 360 reduces the non-character symbol regions received from the text encoding unit 32, for example while referring to the input image, at a reduction ratio lower than that of the first reduction processing unit 300 (so that the reduced image is larger than the one produced by the first reduction processing unit 300), and outputs the result to the connected component extraction unit 362.

The connected component extraction unit 362 extracts each 8-connected component formed by a pixel of interest and its 8-neighboring pixels in the image input from the second reduction processing unit 360, and outputs the components to the second classification unit 364.

The second classification unit 364 compares the size of each 8-connected component input from the connected component extraction unit 362 with a threshold (second classification threshold) that differs from the threshold of the first classification unit 304 (first classification threshold). If the size of the 8-connected component is smaller than the second classification threshold, it outputs the pixel of interest to the text encoding unit 32 as a text region; if the size is equal to or greater than the second classification threshold, it outputs the pixel of interest to the general encoding unit 34 as another region.

Next, the process by which the image processing apparatus 1 encodes an input image by executing the encoding program 3 will be described.
FIG. 7 is a flowchart showing the process (S10) in which the image processing apparatus 1 encodes an input image by executing the encoding program 3.
As shown in FIG. 7, in step 100 (S100), the region rough separation unit 30 roughly separates the input image into text regions and other regions (rough separation).

In step 102 (S102), the text encoding unit 32 uses the non-text symbol removal unit 324 to determine, for each region, whether the rough separation by the region rough separation unit 30 is correct. If the region has been separated correctly, the process proceeds to S106; if not, the process proceeds to S104.

In step 104 (S104), the region detail separation unit 36 separates in detail the regions that the region rough separation unit 30 separated incorrectly into text regions and other regions (detailed separation).

In step 106 (S106), the text encoding unit 32 text-encodes the text regions, including those obtained when the region detail separation unit 36 re-separated regions that the region rough separation unit 30 had separated incorrectly.

In step 108 (S108), the general encoding unit 34 general-encodes the other regions, including those obtained when the region detail separation unit 36 re-separated regions that the region rough separation unit 30 had separated incorrectly.
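The control flow of S100 to S108 can be sketched with the individual stages passed in as callables. All names here (`encode_image` and its parameters) are hypothetical, and the sketch simplifies one point: text regions returned by the detailed separation are encoded directly, whereas the description routes them through the symbol extraction unit again.

```python
from typing import Callable, List, Tuple, TypeVar

Region = TypeVar("Region")


def encode_image(
    image: object,
    coarse_split: Callable[[object], Tuple[List[Region], List[Region]]],
    looks_like_text: Callable[[Region], bool],
    fine_split: Callable[[Region], Tuple[List[Region], List[Region]]],
    encode_text: Callable[[List[Region]], bytes],
    encode_generic: Callable[[List[Region]], bytes],
) -> Tuple[bytes, bytes]:
    """Run rough separation, re-separate only the doubtful regions, then encode."""
    text, other = coarse_split(image)                   # S100: rough separation
    confirmed: List[Region] = []
    for region in text:
        if looks_like_text(region):                     # S102: check the rough result
            confirmed.append(region)
        else:
            more_text, more_other = fine_split(region)  # S104: detailed separation
            confirmed.extend(more_text)
            other = other + more_other
    # S106: text encoding, S108: general encoding
    return encode_text(confirmed), encode_generic(other)
```

Only regions that fail the S102 check pay the cost of the detailed pass, which is the source of the speed benefit claimed in the description.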

In the above embodiment, the encoding program 3 was described using the example of separating the input image into text regions and other regions, but it is not limited to this. For example, the encoding program 3 may have one or more additional encoding units besides the text encoding unit 32 and the general encoding unit 34, or may separate out halftone regions and the like so that the input image is separated into, and encoded as, each of the region types supported by JBIG2.
Furthermore, when the encoding program 3 has three or more encoding units, for example, the general encoding unit 34 may remove regions that are not suited to encoding by the general encoding unit 34 and output them to the second reduction processing unit 360, and the region detail separation unit 36 may separate each of the received regions in detail.

In this way, when the text encoding unit 32 encodes the result of rough separation by the region rough separation unit 30, the region detail separation unit 36 performs detailed separation on the regions that the region rough separation unit 30 separated incorrectly. This reduces the amount of region data that the region detail separation unit 36 must separate in detail and prevents the text encoding unit 32 from performing unnecessary pattern matching, so that waste in the symbol dictionary is avoided and the compression ratio, encoding speed, and image quality are improved.

FIG. 1 is a configuration diagram showing an outline of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a program configuration diagram showing the configuration of the encoding program executed by the image processing apparatus.
FIG. 3 is a block diagram showing the configuration of the region rough separation unit.
FIG. 4 is a block diagram showing the configuration of the text encoding unit.
FIG. 5 is a block diagram showing the configuration of the region detail separation unit.
FIG. 6 is a reduced image example described in "A Fast Segmentation Algorithm for Bi-Level Image Compression using JBIG2" (Reference A).
FIG. 7 is a flowchart showing the process (S10) in which the image processing apparatus encodes an input image by executing the encoding program.

Explanation of symbols

1 ... Image processing apparatus
12 ... Storage device
16 ... Communication device
120 ... Storage medium
2 ... Control device
20 ... CPU
22 ... Memory
3 ... Encoding program
30 ... Region rough separation unit
300 ... First reduction processing unit
302 ... Connected component extraction unit
304 ... First classification unit
32 ... Text encoding unit
320 ... Symbol extraction unit
322 ... Feature amount extraction unit
324 ... Non-text symbol removal unit
326 ... Dictionary creation unit
328 ... Encoding processing unit
34 ... General encoding unit
36 ... Region detail separation unit
360 ... Second reduction processing unit
362 ... Connected component extraction unit
364 ... Second classification unit

Claims

An encoding apparatus comprising:
first reduction means for reducing input data at a first reduction ratio;
first classification means for classifying the input data reduced by the first reduction means into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a first value;
encoding means for determining, based on a feature amount of a connected component of black pixels included in the character data classified by the first classification means, whether the input data reduced by the first reduction means has been correctly classified, and for encoding character data determined to be correctly classified with an algorithm corresponding to character data;
second reduction means for reducing data other than character data, which the encoding means has determined not to be correctly classified, at a second reduction ratio lower than the first reduction ratio; and
second classification means for classifying the data other than character data reduced by the second reduction means into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a second value different from the first value,
wherein the encoding means further encodes the character data classified by the second classification means.

The encoding apparatus according to claim 1, wherein the first classification means separates a predetermined amount of data with a smaller processing amount than the second classification means.

An image processing apparatus comprising:
first reduction means for reducing an input image at a first reduction ratio;
first classification means for classifying the input data reduced by the first reduction means into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a first value;
encoding means for determining, based on a feature amount of a connected component of black pixels included in the character data classified by the first classification means, whether the input data reduced by the first reduction means has been correctly classified, and for encoding character data determined to be correctly classified with an algorithm corresponding to character data;
second reduction means for reducing data other than character data, which the encoding means has determined not to be correctly classified, at a second reduction ratio lower than the first reduction ratio; and
second classification means for classifying the data other than character data reduced by the second reduction means into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a second value different from the first value,
wherein the encoding means further encodes the character data classified by the second classification means.

The image processing apparatus according to claim 3, wherein the first classification means separates a predetermined amount of data with a smaller processing amount than the second classification means.

The image processing apparatus according to claim 3 or 4, wherein the first classification means separates the input image into any two or more kinds of data, including combinations of character data, halftone data, data not suitable for encoding as a JBIG2 generic region, and other data.

An encoding method comprising:
reducing input data at a first reduction ratio;
classifying the input data reduced at the first reduction ratio into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a first value;
determining, based on a feature amount of a connected component of black pixels included in the classified character data, whether the input data reduced at the first reduction ratio has been correctly classified, and encoding character data determined to be correctly classified with an algorithm corresponding to character data;
reducing data other than character data, which has been determined not to be correctly classified, at a second reduction ratio lower than the first reduction ratio;
classifying the data other than character data reduced at the second reduction ratio into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a second value different from the first value; and
encoding the classified character data.

An encoding program for causing a computer to execute the steps of:
reducing input data at a first reduction ratio;
classifying the input data reduced at the first reduction ratio into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a first value;
determining, based on a feature amount of a connected component of black pixels included in the classified character data, whether the input data reduced at the first reduction ratio has been correctly classified, and encoding character data determined to be correctly classified with an algorithm corresponding to character data;
reducing data other than character data, which has been determined not to be correctly classified, at a second reduction ratio lower than the first reduction ratio;
classifying the data other than character data reduced at the second reduction ratio into character data and data other than character data, based on a result of comparing the size of a connected component included in the data with a second value different from the first value; and
encoding the classified character data.