JP6545592B2 - Image processing apparatus and image processing program

Info

Publication number: JP6545592B2
Application number: JP2015191308A
Authority: JP
Inventors: 真一尾関
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2015-09-29
Filing date: 2015-09-29
Publication date: 2019-07-17
Anticipated expiration: 2035-09-29
Also published as: JP2017068433A

Description

The present invention relates to an image processing apparatus and an image processing program, and more particularly to a technique for determining the type of characters written on a document.

In an image forming apparatus such as a copying machine or a multifunction peripheral, determining what type of document has been read by the image input device (hereinafter referred to as the "read document") is important for performing image processing suited to the document type. In particular, when the read document is a text document, it is important to determine whether the characters written on it are handwritten. This is because, for handwritten characters, expansion processing must be applied to the image data of the read document in order to reduce faintness and patchiness of the characters in the image.

Techniques for determining whether the characters written on a document are handwritten have been proposed before (see, for example, Patent Document 1). Specifically, the likelihood that the characters on the document are handwritten or printed is quantified, and the determination is made based on the resulting numerical values. These values include various quantities, such as values obtained by analyzing the gradation distribution of the characters in the image, values obtained by analyzing the color of the characters, values obtained by stroke-based analysis of the characters, and values obtained by analyzing the position and size of the characters.

Patent Document 1: International Publication No. 2011/074067

However, the determination technique proposed above requires quantification based on a variety of analyses and therefore requires complicated processing. Such processing is time-consuming and thus runs counter to the recent demand for shortening the time from image input to output.

Accordingly, an object of the present invention is to provide an image processing apparatus and an image processing program capable of determining, with high accuracy despite using simple processing, whether the characters written on a document are handwritten.

An image processing apparatus according to the present invention includes a character recognition processing unit, a recognized character counting unit, and a determination unit. The character recognition processing unit recognizes characters included in the image data of a document. The recognized character counting unit counts the characters recognized by the character recognition processing unit to obtain their total number (the number of recognized characters). Based on the number of recognized characters calculated by the recognized character counting unit, the determination unit determines whether the characters written on the document are handwritten.

The number of recognized characters tends to be significantly smaller for a handwritten document than for a printed document. This is because the form of handwritten characters varies from writer to writer, which lowers the accuracy of character recognition in the character recognition processing unit. The determination unit makes its determination based on this tendency.

In a specific configuration of the above image processing apparatus, the apparatus further includes an edge extraction unit, an edge counting unit, and a first comparison unit. The edge extraction unit extracts the edge pixels that make up the edges of the image contained in the image data. The edge counting unit counts the edge pixels extracted by the edge extraction unit to obtain their total number (the number of edge pixels). The first comparison unit determines whether a first ratio, defined as the ratio of the number of edge pixels to the number of recognized characters calculated by the recognized character counting unit, is greater than a first threshold. When the first comparison unit determines that the first ratio is greater than the first threshold, the determination unit determines that the characters written on the document are handwritten.

The number of edge pixels does not differ greatly between a handwritten document and a printed document. The number of recognized characters, on the other hand, tends to be significantly smaller for a handwritten document than for a printed one. The first ratio therefore tends to be significantly larger for a handwritten document than for a printed one. By setting the first threshold to an appropriate value based on this tendency, a determination based on the magnitude relationship between the first ratio and the first threshold can be realized.
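The first-ratio test described above can be sketched as follows. This is only an illustrative Python sketch, not the implementation disclosed here: the function name, the example counts, the threshold value of 100.0, and the handling of a zero character count are all assumptions made for illustration.

```python
def is_handwritten_by_first_ratio(edge_pixel_count: int,
                                  recognized_char_count: int,
                                  first_threshold: float) -> bool:
    """Return True when the first ratio exceeds the first threshold.

    first ratio = number of edge pixels / number of recognized characters
    """
    if recognized_char_count == 0:
        # Assumption: when nothing was recognized, treat the ratio as
        # unbounded if any edge pixels exist at all.
        return edge_pixel_count > 0
    first_ratio = edge_pixel_count / recognized_char_count
    return first_ratio > first_threshold


# Hypothetical counts: similar edge content, very different OCR yield.
print(is_handwritten_by_first_ratio(12000, 40, 100.0))   # ratio 300.0
print(is_handwritten_by_first_ratio(12000, 600, 100.0))  # ratio 20.0
```

In practice the threshold would have to be tuned on sample documents, since the ratio depends on scan resolution and on the character recognition engine in use.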

In another specific configuration of the above image processing apparatus, the apparatus further includes a consecutive character counting unit and a second comparison unit. Among the characters recognized by the character recognition processing unit, the consecutive character counting unit counts those that are judged to have the same size and that occur consecutively in runs of at least a predetermined number, thereby obtaining the total number of consecutive characters (the number of consecutive characters). The second comparison unit determines whether a second ratio, defined as the ratio of the number of consecutive characters to the number of recognized characters calculated by the recognized character counting unit, is smaller than a second threshold. When the second comparison unit determines that the second ratio is smaller than the second threshold, the determination unit determines that the characters written on the document are handwritten.

In a printed document the form of the characters (font and size) is usually uniform, so the number of recognized characters is large and the number of consecutive characters tends to be close to the number of recognized characters. For a printed document, therefore, the second ratio tends to be close to 1. In a handwritten document, on the other hand, the form of the characters varies from writer to writer and the character sizes tend to be irregular. The number of recognized characters is therefore small, and the number of consecutive characters tends to be smaller still. For a handwritten document, therefore, the second ratio tends to be close to 0. By setting the second threshold to an appropriate value based on this tendency, a determination based on the magnitude relationship between the second ratio and the second threshold can be realized.
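As a rough illustration of the second-ratio test, the Python sketch below counts characters that belong to runs of equal-sized consecutive characters and compares the resulting ratio with a threshold. The representation of each recognized character by a single size value, the `tolerance` parameter, the function names, and the behavior for an empty input are assumptions for illustration; the text above does not specify how "same size" is judged.

```python
from typing import Sequence


def count_consecutive_chars(sizes: Sequence[float], min_run: int,
                            tolerance: float = 0.0) -> int:
    """Count characters lying in runs of at least `min_run` same-size characters.

    Two neighbouring characters are treated as the same size when their
    sizes differ by at most `tolerance` (an assumed criterion).
    """
    total = 0
    run = 0
    previous = None
    for size in sizes:
        if previous is not None and abs(size - previous) <= tolerance:
            run += 1
        else:
            if run >= min_run:
                total += run
            run = 1
        previous = size
    if run >= min_run:
        total += run
    return total


def is_handwritten_by_second_ratio(sizes: Sequence[float], min_run: int,
                                   second_threshold: float,
                                   tolerance: float = 0.0) -> bool:
    """Return True when consecutive/recognized is below the second threshold."""
    recognized = len(sizes)
    if recognized == 0:
        # Assumption: the ratio is undefined, so this sketch does not classify.
        return False
    second_ratio = count_consecutive_chars(sizes, min_run, tolerance) / recognized
    return second_ratio < second_threshold


printed_sizes = [10] * 20                          # uniform type
handwritten_sizes = [10, 13, 9, 12, 10, 14, 11, 9]  # irregular sizes
print(is_handwritten_by_second_ratio(printed_sizes, 3, 0.5))
print(is_handwritten_by_second_ratio(handwritten_sizes, 3, 0.5))
```

Comparing each character with its immediate predecessor keeps the sketch short, but with a nonzero tolerance it allows sizes to drift gradually within a run; comparing against the first character of the run is an equally plausible choice.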

Another image processing apparatus according to the present invention includes an edge extraction unit, a straight line detection unit, a straight line interval calculation unit, and a determination unit. The edge extraction unit extracts the edge pixels that make up the edges of the image contained in the image data of a document. Based on the edge pixels extracted by the edge extraction unit, the straight line detection unit detects straight lines corresponding to the edges of images that form straight lines in the image data. The straight line interval calculation unit calculates the intervals between the straight lines detected by the straight line detection unit. Based on the intervals calculated by the straight line interval calculation unit, the determination unit determines whether the characters written on the document are handwritten.

If an interval calculated by the straight line interval calculation unit matches the spacing of the ruled lines found on existing documents, the document is likely to have ruled lines. A document with ruled lines, in turn, is likely to contain handwritten characters. The determination unit can therefore make its determination based on the intervals calculated by the straight line interval calculation unit.

More specifically, the straight line interval calculation unit extracts, from the straight lines detected by the straight line detection unit, those whose length is equal to or greater than a predetermined length, and calculates the intervals for the extracted lines. In this way, the detected lines that are most likely to correspond to ruled lines are selected, which improves the determination accuracy of the determination unit.
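The length filter and interval check described above might look like the following Python sketch. The line representation (a position perpendicular to the line plus a length), the list of known ruled-line spacings, the tolerance, and the rule that every interval must match are all hypothetical choices; the text above states only that the determination is based on the calculated intervals.

```python
from typing import List, Sequence, Tuple

# Each detected line is (position, length), e.g. the y coordinate and the
# length in pixels of a horizontal line. This representation is assumed.
Line = Tuple[float, float]


def line_intervals(lines: Sequence[Line], min_length: float) -> List[float]:
    """Keep lines at least `min_length` long and return adjacent spacings."""
    positions = sorted(pos for pos, length in lines if length >= min_length)
    return [b - a for a, b in zip(positions, positions[1:])]


def looks_ruled(lines: Sequence[Line], min_length: float,
                known_intervals: Sequence[float], tolerance: float) -> bool:
    """Return True when every spacing matches some known ruled-line spacing."""
    intervals = line_intervals(lines, min_length)
    if not intervals:
        return False  # fewer than two long lines: nothing to compare
    return all(
        any(abs(interval - known) <= tolerance for known in known_intervals)
        for interval in intervals
    )


detected = [(100, 1800), (170, 1800), (240, 1795), (150, 40)]  # last is a short stroke
print(line_intervals(detected, 1000))          # spacings of the three long lines
print(looks_ruled(detected, 1000, [70], 2))
```

Discarding short lines first keeps character strokes, such as the 40-pixel segment in the example, from being mistaken for ruled lines.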

In yet another specific configuration of the above image processing apparatus, the apparatus further includes an expansion processing unit, which applies expansion processing to the image data when the determination unit determines that the characters written on the document are handwritten. As a result, even if the handwritten characters on the document are faint or patchy, the characters in the processed image are easy to read.

An image processing program according to the present invention causes an image processing apparatus to execute character recognition processing, recognized character counting processing, and determination processing. In the character recognition processing, characters included in the image data of a document are recognized. In the recognized character counting processing, the characters recognized by the character recognition processing are counted to obtain their total number (the number of recognized characters). In the determination processing, whether the characters written on the document are handwritten is determined based on the number of recognized characters calculated by the recognized character counting processing.

Another image processing program according to the present invention causes an image processing apparatus to execute edge extraction processing, straight line detection processing, straight line interval calculation processing, and determination processing. In the edge extraction processing, the edge pixels that make up the edges of the image contained in the image data of a document are extracted. In the straight line detection processing, straight lines corresponding to the edges of images that form straight lines in the image data are detected based on the edge pixels extracted by the edge extraction processing. In the straight line interval calculation processing, the intervals between the straight lines detected by the straight line detection processing are calculated. In the determination processing, whether the characters written on the document are handwritten is determined based on the intervals calculated by the straight line interval calculation processing.

According to the image processing apparatus and the image processing program of the present invention, whether the characters written on a document are handwritten can be determined with high accuracy despite the simplicity of the processing.

FIG. 1 is a block diagram showing the configuration of an image forming apparatus (multifunction peripheral) to which the image processing apparatus of the present invention is applied, and the flow of processing executed by the image processing apparatus when the copy mode is selected.
FIG. 2 is a block diagram showing the configuration of the automatic document determination unit and the flow of the determination processing it executes.
FIG. 3 schematically shows images (a) before and (b) after processing by the expansion processing unit.
FIG. 4 is a block diagram showing the configuration of the character determination unit of the automatic document determination unit, and the flow of processing, when the first determination process is executed.
FIG. 5 is a flowchart showing the flow of the first determination process.
FIG. 6 is a block diagram showing the configuration of the character recognition processing unit of the character determination unit and the flow of the recognition processing it executes.
FIG. 7 shows image data used to explain character extraction by the layout analysis unit of the character recognition processing unit.
FIG. 8 is a block diagram showing the configuration of the character determination unit, and the flow of processing, when the second determination process is executed.
FIG. 9 is a flowchart showing the flow of the second determination process.
FIG. 10 is a flowchart showing the flow of processing executed by the consecutive character counting unit.
FIG. 11 is a block diagram showing the configuration of the character determination unit, and the flow of processing, when the third determination process is executed.
FIG. 12 is a flowchart showing the flow of the third determination process.
FIG. 13 is used to explain the processing in the straight line detection unit and shows (a) the coordinate system before the Hough transform (x-y coordinate system) and (b) the coordinate system after the Hough transform (r-θ coordinate system).
FIG. 14 shows the interval between two adjacent straight lines calculated by the straight line interval calculation unit.
FIG. 15 is a block diagram showing the configuration of the character determination unit, and the flow of processing, when the fourth determination process is executed.
FIG. 16 is a block diagram showing the configuration of the character determination unit, and the flow of processing, when the first to fourth determination processes are combined.
FIG. 17 is a block diagram showing the flow of processing executed by the image processing apparatus of the image forming apparatus (multifunction peripheral) when the image transmission mode is selected.
FIG. 18 is a block diagram showing the configuration of another image forming apparatus (a color image reading apparatus) to which the image processing apparatus of the present invention is applied, and the flow of processing executed by its image processing apparatus.

Hereinafter, embodiments in which the image processing apparatus of the present invention is applied to an image forming apparatus are described in detail with reference to the drawings.

[1] First Embodiment
[1-1] Configuration of the Image Forming Apparatus
As shown in FIG. 1, the image forming apparatus includes an image input device 1, an image processing apparatus 2, an image output device 3, a transmitting and receiving device 5, a storage unit 6, and a control unit 7. This image forming apparatus is a multifunction peripheral with various operation modes (copy mode, printer mode, facsimile transmission/reception mode, image transmission mode, and so on). FIG. 1 shows the operation of the image forming apparatus when the copy mode is selected.

≪Image Input Device≫
The image input device 1 optically reads the image of a document placed on a document table (not shown) or in a document feeder (not shown). The image input device 1 is, for example, a scanner with a CCD (Charge Coupled Device) line sensor. It separates the light reflected from the document into the three colors R (red), G (green), and B (blue) and converts each separated light component into an electric signal (hereinafter referred to as "RGB analog signals"). The RGB analog signals are input to the image processing apparatus 2. The operation of the image input device 1 is controlled by the control unit 7, which is composed of a CPU (Central Processing Unit), a DSP (Digital Signal Processor), or the like.

≪Image Processing Apparatus (Processing When the Copy Mode Is Selected)≫
When a document is read by the image input device 1, the image processing apparatus 2 executes image processing based on the RGB analog signals generated by the image input device 1. The image processing executed at this time differs depending on the operation mode (copy mode, facsimile transmission/reception mode, image transmission mode, and so on) selected on the image forming apparatus.

The processing executed by the image processing apparatus 2 when the copy mode is selected is described here with reference to FIG. 1. Specifically, in the image processing apparatus 2, processing is executed by each of the A/D conversion unit 201, the shading correction unit 202, the input processing unit 203, the automatic document determination unit 204, the expansion processing unit 205, the region separation processing unit 206, the color correction unit 207, the black generation/under color removal unit 208, the spatial filter unit 209, the output tone correction unit 210, and the halftone generation unit 211. The control unit 7 controls the processing executed by each unit of the image processing apparatus 2 as well as operations such as the transfer of signals and image data.

＜A/D Conversion Unit＞
The A/D conversion unit 201 converts the RGB analog signals input to the image processing apparatus 2 into digital signals (RGB digital signals), thereby generating image data composed of the three colors R, G, and B (hereinafter referred to as "RGB image data").

＜Shading Correction Unit＞
The shading correction unit 202 removes, from the RGB image data generated by the A/D conversion unit 201, the distortions introduced by the illumination system, the image-forming optical system, and the image-capturing system of the image input device 1.

＜Input Processing Unit＞
The input processing unit 203 applies γ correction to the RGB image data after shading correction. The γ-corrected RGB image data is stored in the storage unit 6, where it is managed by the control unit 7 as filing data.

When the γ-corrected RGB image data is stored in the storage unit 6, it is compressed into JPEG code based on, for example, the JPEG compression algorithm. When the image forming apparatus performs image formation, the JPEG code is read from the storage unit 6 and decoded. Hereinafter, unless otherwise noted, the RGB image data generated by this decoding is referred to simply as "RGB image data." In FIG. 1, the RGB image data passing from the input processing unit 203 to the automatic document determination unit 204 corresponds to the RGB image data generated by the decoding.

＜Automatic Document Determination Unit＞
Based on the RGB image data input to it, the automatic document determination unit 204 determines what type of document the document read by the image input device 1 (hereinafter referred to as the "read document") is. Specifically, following the flow shown in FIG. 2, the document determination unit 204a and the character determination unit 204b of the automatic document determination unit 204 execute their processing in that order.

Based on the RGB image data, the document determination unit 204a determines whether the read document is a text document, a printed photograph document, a document in which text and printed photographs are mixed, or a photograph document in which the gradation changes continuously. As one example, a conventional determination technique is used for this purpose (see, for example, JP-A-2002-232708). The determination method is not limited to the conventional technique, and various modifications are possible.

The determination result of the document determination unit 204a is stored in the storage unit 6 as document determination data, together with the determination result of the character determination unit 204b described later (character determination data). The document determination data is read from the storage unit 6 as needed and used by the expansion processing unit 205, the color correction unit 207, the black generation/under color removal unit 208, the spatial filter unit 209, and the halftone generation unit 211.

When the document determination unit 204a determines that the read document is a text document, the character determination unit 204b executes its determination processing in response. Specifically, based on the RGB image data input to the automatic document determination unit 204, the character determination unit 204b determines whether the characters written on the text document are handwritten. In addition to text documents, the documents subjected to determination by the character determination unit 204b may include documents in which printed photographs or the like are partially mixed. The determination processing executed by the character determination unit 204b is described in detail later.

＜Expansion Processing Unit＞
When the document determination unit 204a determines that the read document is a text document and the character determination unit 204b determines that the characters written on it are handwritten, the expansion processing unit 205 receives that result (the document determination data) and applies expansion processing to the RGB image data. Specifically, the expansion processing unit 205 applies image processing to the character portions of the RGB image data that makes the characters thicker. This is because handwritten characters (particularly characters written in pencil) may be faint or patchy.

An example of the expansion processing is described with reference to FIGS. 3(a) and 3(b). FIGS. 3(a) and 3(b) schematically show pixels whose values are expressed in 8 bits (256 gradations) and, for convenience, show three regions with pixel values of 0, 100, and 255. The regions with pixel values of 0 and 100 correspond, for example, to character portions. To simplify the description, FIGS. 3(a) and 3(b) show the pixel values of only one color component.

First, for each pixel shown in FIG. 3(a), the expansion processing unit 205 compares the pixel value of that pixel (the target pixel) with the pixel values of the eight pixels surrounding it (the peripheral pixels). If any peripheral pixel has a smaller pixel value than the target pixel, the expansion processing unit 205 replaces the pixel value of the target pixel with that smaller peripheral value (see FIG. 3(b)). As a result, the target pixel takes the smallest value among its own pixel value and those of its peripheral pixels.

This expansion processing enlarges the regions with pixel values of 0 and 100. Therefore, even if the handwritten characters on the text document are faint or patchy, the characters in the processed image are easy to read.
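The expansion processing described above amounts to a 3x3 minimum filter on one color component (often called dilation of the dark regions). The Python sketch below illustrates it under stated assumptions: the image is a list of rows of 8-bit values, pixels outside the image are simply ignored at the borders, and the function name is hypothetical.

```python
from typing import List


def expand_dark_regions(image: List[List[int]]) -> List[List[int]]:
    """Replace each pixel with the minimum of itself and its 8 neighbours.

    Smaller values are darker, so dark character strokes grow by one pixel
    in every direction. The input image is left unmodified.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            smallest = image[y][x]
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    # Border handling (an assumption): skip missing neighbours.
                    if 0 <= ny < height and 0 <= nx < width:
                        smallest = min(smallest, image[ny][nx])
            result[y][x] = smallest
    return result


# A single dark pixel on a white background becomes a 3x3 dark block.
page = [[255] * 5 for _ in range(5)]
page[2][2] = 0
for row in expand_dark_regions(page):
    print(row)
```

Reading from the original image and writing to a copy matters here: updating pixels in place would let an already-darkened pixel spread further in the same pass.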

＜Region Separation Processing Unit＞
Based on the RGB image data (the expanded RGB image data, if expansion processing has been applied by the expansion processing unit 205), the region separation processing unit 206 determines, for each pixel of the image data, the type of image region to which the pixel belongs. Image region types include black character regions, color character regions, and halftone dot regions. The determination result is output from the region separation processing unit 206 and stored in the storage unit 6. This result (the region separation data) is read from the storage unit 6 as needed and used by the black generation/under color removal unit 208, the spatial filter unit 209, and the halftone generation unit 211.

<Color correction unit>
The color correction unit 207 converts the RGB image data (the RGB image data after expansion processing, when the expansion processing unit 205 has performed expansion processing) into image data composed of the three colors C (cyan), M (magenta), and Y (yellow) (hereinafter referred to as "CMY image data"). That is, it converts the color space from the RGB space to the CMY space, whose colors are complementary to those of the RGB space. In doing so, the color correction unit 207 applies processing that improves color reproducibility to the RGB image data. The determination result of the automatic document determination unit 204 (document determination data) is used as needed.

<Black generation / under color removal unit>
The black generation / under color removal unit 208 converts the CMY image data generated by the color correction unit 207 into image data composed of the four colors C (cyan), M (magenta), Y (yellow), and K (black) (hereinafter referred to as "CMYK image data"). That is, it converts the color space from the CMY space to the CMYK space. The determination results of the automatic document determination unit 204 and the region separation processing unit 206 (document determination data and region separation data) are used as needed.
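The description does not give formulas for either color space conversion. As an illustration only, the sketch below uses the textbook complement for RGB to CMY and a simple minimum-based black generation with under color removal. The function names and the `ucr_rate` parameter are hypothetical, 8-bit channels are assumed, and the color reproducibility processing performed by the color correction unit 207 is not modeled.

```python
def rgb_to_cmy(r, g, b):
    """Complement each 8-bit channel (assumption: no further colour correction)."""
    return 255 - r, 255 - g, 255 - b


def cmy_to_cmyk(c, m, y, ucr_rate=1.0):
    """Generate K from the common component of C, M, Y and remove it.

    Assumption: K is min(C, M, Y) scaled by ucr_rate, and the same amount
    is subtracted from each colour channel (under colour removal).
    """
    k = int(min(c, m, y) * ucr_rate)
    return c - k, m - k, y - k, k


# Pure black in RGB becomes K only.
black_cmyk = cmy_to_cmyk(*rgb_to_cmy(0, 0, 0))
```

With `ucr_rate=1.0` the gray component is moved entirely into K; a smaller rate would leave part of it in the C, M, and Y channels.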

<Spatial filter unit>
The spatial filter unit 209 applies processing such as enhancement and smoothing to the CMYK image data generated by the black generation / under color removal unit 208. The determination results of the automatic document determination unit 204 and the region separation processing unit 206 (document determination data and region separation data) are used as needed.

<Output tone correction unit>
The output tone correction unit 210 applies output γ correction to the CMYK image data processed by the spatial filter unit 209, so that the output image has appropriate brightness when the data is output to a recording medium such as paper.

<Halftone generation unit>
The halftone generation unit 211 applies tone reproduction processing to the CMYK image data processed by the output tone correction unit 210, so that the output image reproduces the gradation of the image of the read original when the data is output to a recording medium such as paper. The determination results of the automatic document determination unit 204 and the region separation processing unit 206 (document determination data and region separation data) are used as needed.

«Image output device»
The image output device 3 forms an image on paper based on the CMYK image data processed by the halftone generation unit 211. The image output device 3 is a device that prints an image on paper, such as an electrophotographic printer or an inkjet printer. Here, "paper" includes various sheet-like recording media such as plain paper, thick paper, and OHP film.

[1-2] Configuration of the character determination unit and determination processing method (first determination process)
Next, the first determination process executed by the character determination unit 204b described above is explained in detail. The first determination process determines whether the characters in a text document are handwritten, based on the total number of characters recognized from the RGB image data. The second and third determination processes executed by the character determination unit 204b, and the other determination processes, are described in the second and subsequent embodiments.

When the character determination unit 204b executes the first determination process, the edge extraction unit 401, the character recognition processing unit 402, the edge counting unit 403, the recognized character counting unit 404, the first comparison unit 405, and the first determination unit 406 of the character determination unit 204b each execute processing according to the flow shown in FIGS. 4 and 5.

<Edge extraction unit>
The edge extraction unit 401 extracts the edges of images, such as characters, contained in the RGB image data input to the automatic document determination unit 204 (step S11 in FIG. 5). Specifically, the edge extraction unit 401 extracts the pixels that constitute the edges of an image (hereinafter referred to as "edge pixels"). As one example, a conventional extraction method is used (for example, JP-A-2009-94903). The extraction method is not limited to the conventional method, and various modifications are possible.

<Character recognition processing unit>
The character recognition processing unit 402 recognizes the characters contained in the RGB image data input to the automatic document determination unit 204 (step S12 in FIG. 5). Specifically, the character recognition processing unit 402 extracts the characters contained in the RGB image data and determines which character each extracted character is. More specifically, the signal conversion unit 402a, the binarization processing unit 402b, the layout analysis unit 402c, and the character recognition unit 402d of the character recognition processing unit 402 each execute processing according to the flow shown in FIG. 6.

(a) Signal conversion unit
The signal conversion unit 402a converts the RGB digital signal into a luminance signal. Specifically, the signal conversion unit 402a converts the pixel value (Ir, Ig, Ib) of each pixel constituting the RGB image data into a luminance value Y. For example, the following equation (1) is used for this conversion. The data obtained by the conversion is hereinafter referred to as "luminance data".

The conversion to the luminance value Y is not limited to the conversion according to equation (1), and various modifications are possible. The signal conversion unit 402a may instead convert the RGB digital signal into another signal such as the CIE 1976 L*a*b* signal (CIE: Commission Internationale de l'Eclairage; L*: lightness; a* and b*: chromaticity). Furthermore, instead of converting the RGB digital signal into another signal, the signal conversion unit 402a may extract only the G signal from the RGB signal.
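Equation (1) itself is not reproduced in this text, so its coefficients are unknown here. As a hedged illustration of the kind of conversion described, the sketch below uses the widely used ITU-R BT.601 luma weights (0.299, 0.587, 0.114), which are an assumption and not necessarily the weights of equation (1). The function names are hypothetical, and `green_only` illustrates the alternative of extracting only the G signal.

```python
def to_luminance(rgb_image, weights=(0.299, 0.587, 0.114)):
    """Convert rows of (Ir, Ig, Ib) tuples into rows of luminance values Y.

    Assumption: BT.601 weights; equation (1) of the source may differ.
    """
    wr, wg, wb = weights
    return [
        [int(round(wr * r + wg * g + wb * b)) for (r, g, b) in row]
        for row in rgb_image
    ]


def green_only(rgb_image):
    """Alternative mentioned in the text: keep only the G signal."""
    return [[g for (_, g, _) in row] for row in rgb_image]


luminance_data = to_luminance([[(255, 255, 255), (0, 0, 0), (0, 255, 0)]])
```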

(b) Binarization processing unit
The binarization processing unit 402b applies threshold processing to the luminance data generated by the signal conversion unit 402a. For example, when the luminance value Y of each pixel constituting the luminance data is represented by 8 bits, the binarization processing unit 402b converts, for each pixel, the luminance value Y to 255 (white) if it is equal to or greater than a threshold (for example, 128), and to 0 (black) if it is smaller than the threshold. The threshold may instead be the average of the luminance values Y of a plurality of pixels including the target pixel (for example, the 5 × 5 pixels centered on the target pixel).
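Both thresholding variants described above can be sketched as follows. The function names are hypothetical, and the handling of windows that extend past the image border (they are clipped to the image) is an assumption, since the description does not cover it.

```python
def binarize(luminance, threshold=128):
    """Fixed threshold: Y >= threshold becomes 255 (white), otherwise 0 (black)."""
    return [[255 if y >= threshold else 0 for y in row] for row in luminance]


def binarize_local_mean(luminance, window=5):
    """Use the mean of the window x window block around each pixel as threshold.

    Assumption: near the border the window is clipped to the image.
    """
    height = len(luminance)
    width = len(luminance[0]) if height else 0
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [
                luminance[ny][nx]
                for ny in range(max(0, y - half), min(height, y + half + 1))
                for nx in range(max(0, x - half), min(width, x + half + 1))
            ]
            mean = sum(values) / len(values)
            row.append(255 if luminance[y][x] >= mean else 0)
        result.append(row)
    return result
```

The local-mean variant adapts to uneven background brightness, at the cost of treating every pixel of a perfectly uniform block as white.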

(c) Layout analysis unit
The layout analysis unit 402c extracts the characters (including symbols and the like) contained in the luminance data by performing layout analysis on the binarized luminance data (image data). Specifically, the layout analysis unit 402c first applies labeling processing to the luminance data, assigning a different number i to each of the black portions. Here, a black portion is a run of connected black pixels, and all pixels constituting the same black portion receive the same number i. Each labeled black portion is thereby extracted as one character.

Thereafter, as shown in FIG. 7, the layout analysis unit 402c sets a circumscribed rectangle Ra(i) for each labeled black portion. To do so, the layout analysis unit 402c extracts, from the pixels constituting the black portion with number i, the pixels located at the uppermost, lowermost, rightmost, and leftmost ends, and obtains their coordinates in the two-dimensional orthogonal coordinate system set on the luminance data. It then sets the circumscribed rectangle Ra(i) based on those coordinates. The number i of each character extracted from the luminance data (hereinafter referred to as an "extracted character") and the circumscribed rectangle Ra(i) are stored in the storage unit 6 in association with each other. As the data representing the circumscribed rectangle Ra(i), for example, the coordinates of its upper-left vertex and lower-right vertex are used.
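The labeling and circumscribed-rectangle steps can be sketched as a connected-component search. This is a minimal sketch under stated assumptions: the function name `label_black_parts` is hypothetical, black pixels are those with value 0, pixels touching diagonally are treated as connected (8-connectivity, which the description does not specify), and numbers are assigned in raster-scan order.

```python
from collections import deque


def label_black_parts(binary):
    """Label connected black parts and return (labels, rectangles).

    rectangles[i - 1] is ((left, top), (right, bottom)) for number i,
    i.e. the upper-left and lower-right vertices of Ra(i).
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    rectangles = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 0 or labels[y][x] != 0:
                continue  # white pixel, or already labelled
            number = len(rectangles) + 1
            labels[y][x] = number
            left = right = x
            top = bottom = y
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                # Assumption: 8-connectivity.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 0
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = number
                            queue.append((ny, nx))
            rectangles.append(((left, top), (right, bottom)))
    return labels, rectangles


W = 255
sample = [
    [W, 0, W, W, W],
    [W, 0, W, 0, 0],
    [W, W, W, 0, W],
]
sample_labels, sample_rects = label_black_parts(sample)
```

Note that a character made of several disconnected strokes would be split into several black portions by this simple scheme.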

(d) Character recognition unit
The character recognition unit 402d determines which character each extracted character obtained by the layout analysis unit 402c is. Specifically, the character recognition unit 402d first changes the size of the extracted character so that it matches the size of the registered characters. Here, the registered characters are printed characters of a predetermined size (for example, 10 points) registered in the storage unit 6 as dictionary data. More specifically, for each extracted character (number i), the character recognition unit 402d enlarges or reduces the width and the height of the extracted character so that the size of the circumscribed rectangle Ra(i) corresponds to the size of the registered characters. That is, the character recognition unit 402d normalizes the size of the extracted character.

Thereafter, the character recognition unit 402d collates the normalized extracted character with the registered characters in the dictionary data. When the extracted character matches one of the registered characters, the character recognition unit 402d recognizes the extracted character as the same character as the matched registered character.
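The description does not state how the collation is carried out. Purely as an illustration, the sketch below normalizes a binary glyph to a fixed square size with nearest-neighbor scaling and compares it with each dictionary template by the fraction of agreeing pixels. The function names, the square template size, and the `min_score` acceptance threshold are all assumptions.

```python
def normalize_glyph(glyph, size=8):
    """Nearest-neighbour resize of a binary glyph (list of rows) to size x size."""
    height = len(glyph)
    width = len(glyph[0])
    return [
        [glyph[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]


def match_glyph(glyph, dictionary, size=8, min_score=0.9):
    """Return the best-matching dictionary key, or None if nothing matches.

    Assumption: every template in dictionary is a size x size binary image,
    and a match requires at least min_score of the pixels to agree.
    """
    normalized = normalize_glyph(glyph, size)
    best_name, best_score = None, 0.0
    for name, template in dictionary.items():
        agree = sum(
            normalized[r][c] == template[r][c]
            for r in range(size)
            for c in range(size)
        )
        score = agree / (size * size)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_score else None


# Hypothetical 4 x 4 dictionary with two registered characters.
dictionary = {
    "I": [[255, 0, 0, 255] for _ in range(4)],
    "-": [[255] * 4, [0] * 4, [0] * 4, [255] * 4],
}
glyph_i = [[255, 255, 0, 0, 0, 0, 255, 255] for _ in range(8)]
recognized = match_glyph(glyph_i, dictionary, size=4)
```

An extracted character for which `match_glyph` returns `None` corresponds to a character that is extracted but not recognized, which is what makes the recognized character count drop for handwriting.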

The data of the recognition result obtained in this way by the character recognition unit 402d is output from the character recognition processing unit 402 as character recognition data, together with information on the extracted characters (for example, the assigned number i and the vertex coordinates of the circumscribed rectangle Ra(i)) (see FIG. 4).

<Edge counting unit>
The edge counting unit 403 counts the edge pixels extracted by the edge extraction unit 401. It thereby calculates the total number of edge pixels contained in the RGB image data (the edge pixel count Ce) (step S13 in FIG. 5). The calculated edge pixel count Ce is output from the edge counting unit 403 as data.

<Recognized character counting unit>
The recognized character counting unit 404 counts the characters recognized by the character recognition processing unit 402. It thereby calculates the total number of characters recognized from the RGB image data (the recognized character count Cr) (step S14 in FIG. 5). The calculated recognized character count Cr is output from the recognized character counting unit 404 as data.

<First comparison unit>
The first comparison unit 405 calculates the ratio Ce/Cr from the edge pixel count Ce output from the edge counting unit 403 and the recognized character count Cr output from the recognized character counting unit 404. The first comparison unit 405 then determines whether the calculated ratio Ce/Cr is larger than a threshold Th1 (step S15 in FIG. 5). The threshold Th1 is set to, for example, 10000.

<First determination unit>
The first determination unit 406 determines, based on the result from the first comparison unit 405, whether the characters in the text document are handwritten or printed. Specifically, when the first comparison unit 405 determines that the ratio Ce/Cr is larger than the threshold Th1, the first determination unit 406 determines that the characters in the text document are handwritten (step S16 in FIG. 5). When the first comparison unit 405 determines that the ratio Ce/Cr is not larger than the threshold Th1, the first determination unit 406 determines that the characters in the text document are printed (step S17 in FIG. 5).
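Steps S15 to S17 reduce to a single comparison. In the sketch below, the function name `is_handwritten_first` is hypothetical. The comparison is written as Ce > Th1 × Cr, which is equivalent to Ce/Cr > Th1 whenever Cr > 0; how the case Cr = 0 should be treated is not covered by the description, so the behavior in that case is an assumption.

```python
def is_handwritten_first(edge_pixel_count, recognized_char_count, th1=10000):
    """Return True for handwritten (S16), False for printed (S17).

    Assumption: with Cr = 0 the document counts as handwritten as soon as
    it contains any edge pixels; the source does not define this case.
    """
    # Equivalent to Ce / Cr > Th1 for Cr > 0, without dividing by zero.
    return edge_pixel_count > th1 * recognized_char_count


# Similar edge counts, very different recognized character counts.
printed_page = is_handwritten_first(500000, 400)    # ratio 1250
handwritten_page = is_handwritten_first(500000, 20)  # ratio 25000
```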

The edge pixel count Ce does not differ greatly between a handwritten document and a printed document. The recognized character count Cr, on the other hand, tends to be markedly smaller for a handwritten document than for a printed one, because character shapes vary from writer to writer and the accuracy of character recognition in the character recognition processing unit therefore drops. Consequently, the ratio Ce/Cr tends to be markedly larger for a handwritten document than for a printed one. By setting the threshold Th1 to an appropriate value based on this tendency, a determination based on the magnitude relationship between the ratio Ce/Cr and the threshold Th1 becomes possible.

The data of the determination result obtained in this way by the character determination unit 204b (character determination data) is stored in the storage unit 6 as document determination data, together with the data of the determination result of the document determination unit 204a described above. The document determination data is read from the storage unit 6 as needed and used by the expansion processing unit 205, the color correction unit 207, the black generation / under color removal unit 208, the spatial filter unit 209, and the halftone generation unit 211.

According to the first determination process described above, whether the characters in a document are handwritten is determined with high accuracy even though the processing is simple. The first determination process can be applied not only to Japanese but also to other languages.

The first determination process described above may be realized by causing the image processing apparatus to execute a determination processing program. Such a determination processing program may be recorded in a readable state on a recording medium (a hard disk, a memory card, or the like).

[2] Second embodiment
[2-1] Configuration of the character determination unit and determination processing method (second determination process)
The second determination process executed by the character determination unit 204b is explained in detail. The second determination process determines whether the characters in a text document are handwritten, based on the total number of characters recognized from the RGB image data.

When the character determination unit 204b executes the second determination process, the character recognition processing unit 402, the recognized character counting unit 404, the continuous character counting unit 407, the second comparison unit 408, and the second determination unit 409 of the character determination unit 204b each execute processing according to the flow shown in FIGS. 8 and 9. The processing executed by the character recognition processing unit 402 and the recognized character counting unit 404 (steps S21 and S22 in FIG. 9) is as described in the first embodiment.

<Continuous character counting unit>
The continuous character counting unit 407 counts, among the characters recognized by the character recognition processing unit 402, the characters that are judged to be the same size and that continue for a predetermined number (N, for example N = 5) or more. It thereby calculates the total number of such continuous characters (the continuous character count Ca) (step S23 in FIG. 9).

Specifically, the continuous character counting unit 407 calculates the continuous character count Ca according to the flow shown in FIG. 10. First, the continuous character counting unit 407 sets the number i, the count variable p, and the continuous character count Ca all to 0, and sets the parameter N to 5 (step S201).

Here, the number i is the number assigned to the extracted character A(i) by the layout analysis unit 402c of the character recognition processing unit 402 (see FIG. 7). Specifically, when the read original is a horizontally written text document, the extracted character at the upper left is numbered first (i = 1), and numbers are assigned in order to the extracted characters lined up to its right (i = 2, ...). When the number m (i = m) has been assigned to the extracted character at the right end of a line, the next number m + 1 (i = m + 1) is assigned to the extracted character at the left end of the line immediately below. When the read original is a vertically written text document, the extracted character at the upper right is numbered first (i = 1), and numbers are assigned in order to the extracted characters lined up below it (i = 2, ...). When the number m (i = m) has been assigned to the extracted character at the bottom of a column, the next number m + 1 (i = m + 1) is assigned to the extracted character at the top of the column immediately to the left.

Whether the read original is a horizontally written or a vertically written text document is determined, for example, as follows. As shown in FIG. 7, the spacing h1 between two extracted characters adjacent in the horizontal direction and the spacing h2 between two extracted characters adjacent in the vertical direction are calculated from the vertex coordinates of the circumscribed rectangles Ra(i), and the spacings h1 and h2 are compared. When the horizontal spacing h1 is smaller than the vertical spacing h2, the document is determined to be horizontally written; when the horizontal spacing h1 is larger than the vertical spacing h2, it is determined to be vertically written.
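The spacing comparison can be sketched as follows. The function names are hypothetical, each rectangle is given as ((left, top), (right, bottom)), and the spacing is measured between the facing sides of two rectangles, which is an assumption. The description does not say what happens when h1 equals h2, so the sketch returns `None` in that case.

```python
def horizontal_spacing(left_rect, right_rect):
    """Spacing h1 between two horizontally adjacent circumscribed rectangles."""
    return right_rect[0][0] - left_rect[1][0]


def vertical_spacing(upper_rect, lower_rect):
    """Spacing h2 between two vertically adjacent circumscribed rectangles."""
    return lower_rect[0][1] - upper_rect[1][1]


def writing_direction(h1, h2):
    """Return 'horizontal', 'vertical', or None when h1 == h2 (undefined)."""
    if h1 < h2:
        return "horizontal"
    if h1 > h2:
        return "vertical"
    return None


# Hypothetical rectangles: b is to the right of a, c is below a.
a = ((0, 0), (9, 9))
b = ((12, 0), (21, 9))
c = ((0, 30), (9, 39))
direction = writing_direction(horizontal_spacing(a, b), vertical_spacing(a, c))
```

Here characters sit closer to their horizontal neighbors (spacing 3) than to the line below (spacing 21), so the page is treated as horizontally written.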

The parameter N is the criterion used when judging the continuity of characters. Specifically, the parameter N specifies how many characters must continue before they are counted as continuous characters. The parameter N may be set to a natural number other than 5 (preferably 2 or more).

After executing step S201, the continuous character counting unit 407 increments the number i by 1 in step S202 and, in step S203, acquires from the character recognition data the information on the extracted character A(i) corresponding to that number i. Then, in step S204, the continuous character counting unit 407 determines, based on the acquired information, whether the extracted character A(i) corresponding to the number i is a character that was recognized by the character recognition processing unit 402.

When step S204 determines that the extracted character A(i) is a recognized character, the continuous character counting unit 407 determines in step S205 whether the count variable p is equal to 0. When step S204 determines that the extracted character A(i) is not a recognized character, the continuous character counting unit 407 proceeds to step S213.

When step S205 determines that the count variable p is not equal to 0, the continuous character counting unit 407 proceeds to step S208. When step S205 determines that the count variable p is equal to 0, the continuous character counting unit 407 changes the value of the count variable p from 0 to 1 in step S206 and then proceeds to step S213.

In step S208, the continuous character counting unit 407 determines whether the size of the extracted character A(i) corresponding to the number i is the same as the size of the extracted character A(i-1) corresponding to the preceding number (i-1). Specifically, the continuous character counting unit 407 compares the sizes of the circumscribed rectangles Ra(i) and Ra(i-1) corresponding to the extracted characters A(i) and A(i-1), and thereby determines whether the extracted characters A(i) and A(i-1) are the same size. The continuous character counting unit 407 may make this determination by comparing only the heights of the circumscribed rectangles Ra(i) and Ra(i-1), or by comparing both their heights and their widths.

More specifically, when comparing the heights of the circumscribed rectangles Ra(i) and Ra(i-1), the continuous character counting unit 407 expresses each height as the number of pixels of the rectangle in the height direction. Likewise, when comparing the widths of the circumscribed rectangles Ra(i) and Ra(i-1), it expresses each width as the number of pixels of the rectangle in the horizontal direction. The continuous character counting unit 407 then compares the pixel counts to determine whether the sizes are the same. If the difference between the two pixel counts being compared is within a predetermined range, the continuous character counting unit 407 determines that the sizes are the same.

When step S208 determines that the sizes are the same, the continuous character counting unit 407 increases the value of the count variable p by 1 in step S209. That is, step S209 counts the number of continuous characters. The continuous character counting unit 407 then proceeds to step S213. When step S208 determines that the sizes are not the same, the continuous character counting unit 407 determines in step S210 whether the value of the count variable p is equal to or greater than the parameter N.

When step S210 determines that the value of the count variable p is equal to or greater than the parameter N, the continuous character counting unit 407 increases the continuous character count Ca by the value of the count variable p in step S211 and then proceeds to step S212. When step S210 determines that the value of the count variable p is less than the parameter N, the continuous character counting unit 407 proceeds to step S212 without changing the continuous character count Ca (that is, without executing step S211). In other words, the value of the count variable p is added to the continuous character count Ca only when N or more characters have continued. In step S212, the continuous character counting unit 407 resets the count variable p to 0. It then proceeds to step S213.

In step S213, the continuous character counting unit 407 determines whether the number i has reached the last number E. Here, the number E corresponds to the total number of extracted characters extracted by the layout analysis unit 402c. When step S213 determines that the number i has reached the last number E, the continuous character counting unit 407 proceeds to step S214. When step S213 determines that the number i has not reached the last number E, the continuous character counting unit 407 returns to step S202 and executes the processing from step S202 again.

In step S214, the continuous character counting unit 407 determines whether the value of the count variable p is equal to or greater than the parameter N. If it is, the continuous character counting unit 407 increases the number of consecutive characters Ca by the value of p in step S215 and then ends the character counting process. If the value of p is less than N, the continuous character counting unit 407 leaves Ca unchanged (that is, skips step S215) and ends the character counting process.

That is, even when the number i reaches the last number E while consecutive characters are still being counted (while the count variable p is 1 or more), the characters being counted are added to the number of consecutive characters Ca as long as there are N or more of them.

In the counting process described above, if each run of N or more consecutive characters that was counted is regarded as one character group, all characters belonging to the same character group are required to have the same size, but the size may differ from one character group to another. The counting process thus yields the total number of characters belonging to all the character groups (the number of consecutive characters Ca). The continuous character counting unit 407 outputs the calculated Ca as data.
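The run-based counting of steps S210 to S215 can be sketched as follows. This is a minimal illustration, not the flow of the flowchart itself: the function name, the `sizes` list (one size value per extracted character, in reading order), and the tolerance `tol` are assumptions, and the comparison steps that decide whether a character continues the current run are reduced here to a simple size comparison against the first character of the run.

```python
from typing import Sequence


def count_consecutive_characters(sizes: Sequence[float], n: int, tol: float = 0.0) -> int:
    """Return Ca: the total number of characters in runs of n or more same-size characters."""
    ca = 0      # number of consecutive characters Ca
    p = 0       # count variable p (length of the current run)
    ref = None  # size of the first character of the current run
    for size in sizes:
        if ref is not None and abs(size - ref) <= tol:
            p += 1                 # the run continues
        else:
            if p >= n:             # corresponds to S210/S211: count the finished run
                ca += p
            p = 1                  # simplification: restart the run at this character
            ref = size
    if p >= n:                     # corresponds to S214/S215: count the last run
        ca += p
    return ca


# Runs of length 3, 2 and 4; with N = 3 only the first and last runs are counted.
print(count_consecutive_characters([10, 10, 10, 12, 12, 10, 10, 10, 10], n=3))
```

The final check after the loop mirrors the point made above: a run that is still open when the last character is reached is counted as long as it contains N or more characters.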

<Second Comparison Unit>
The second comparison unit 408 calculates the ratio Ca/Cr from the number of consecutive characters Ca output by the continuous character counting unit 407 and the number of recognized characters Cr output by the recognized character counting unit 404. The second comparison unit 408 then determines whether the calculated ratio Ca/Cr is smaller than a threshold Th2 (step S24 in FIG. 9). The threshold Th2 is set to, for example, 0.4.

<Second Determination Unit>
The second determination unit 409 determines, based on the result from the second comparison unit 408, whether the characters in the text document are handwritten or printed. Specifically, when the second comparison unit 408 determines that the ratio Ca/Cr is smaller than the threshold Th2, the second determination unit 409 determines that the characters in the text document are handwritten (step S25 in FIG. 9). When the second comparison unit 408 determines that the ratio Ca/Cr is not smaller than the threshold Th2, the second determination unit 409 determines that the characters in the text document are printed (step S26 in FIG. 9).

In a printed document, the form of the characters (font and size) is usually uniform, so the number of recognized characters Cr tends to be large and the number of consecutive characters Ca tends to be close to Cr. The ratio Ca/Cr therefore tends to be close to 1 for a printed document. In a handwritten document, on the other hand, the form of the characters varies from writer to writer and the character sizes tend to be irregular. The number of recognized characters Cr therefore tends to be small, and the number of consecutive characters Ca tends to be smaller still (close to 0). The ratio Ca/Cr therefore tends to be close to 0 for a handwritten document. Setting the threshold Th2 to an appropriate value in view of these tendencies makes it possible to base the determination on a comparison of the ratio Ca/Cr with Th2.
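As a minimal sketch of the comparison and determination just described, the following function returns True when the ratio Ca/Cr falls below Th2. The function name is hypothetical, the default of 0.4 is the example value given above, and raising an error when Cr is zero is an assumption, since the text does not say how that case is handled.

```python
def is_handwritten_by_run_ratio(ca: int, cr: int, th2: float = 0.4) -> bool:
    """Return True (handwritten) when Ca/Cr is smaller than Th2, else False (printed)."""
    if cr <= 0:
        # Assumption: the source does not define the result when no character is recognized.
        raise ValueError("Cr must be positive")
    return ca / cr < th2


print(is_handwritten_by_run_ratio(ca=90, cr=100))  # ratio 0.9, near 1: treated as printed
print(is_handwritten_by_run_ratio(ca=10, cr=100))  # ratio 0.1, near 0: treated as handwritten
```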

According to the second determination process described above, whether the characters in a document are handwritten can be determined with high accuracy even though the process is simple. The second determination process is applicable not only to Japanese but also to other languages.

The second determination process described above may be realized by causing the image processing apparatus to execute a determination processing program. Such a program may be recorded in readable form on a recording medium (a hard disk, a memory card, or the like).

[3] Third Embodiment
[3-1] Configuration of the Character Determination Unit and Determination Method (Third Determination Process)
The third determination process executed by the character determination unit 204b is described in detail below. The third determination process detects images that form straight lines in the RGB image data (hereinafter referred to as "straight line images") and determines, based on those straight line images, whether the characters in the text document are handwritten.

When the character determination unit 204b executes the third determination process, the edge extraction unit 401, the straight line detection unit 410, the straight line interval calculation unit 411, the third comparison unit 412, and the third determination unit 413 of the character determination unit 204b each execute their processing in accordance with the flow shown in FIGS. 11 and 12. The processing executed by the edge extraction unit 401 (step S31 in FIG. 12) is as described in the first embodiment.

<Straight Line Detection Unit>
Based on the edge pixels extracted by the edge extraction unit 401, the straight line detection unit 410 detects straight lines L(j) corresponding to the edges of the straight line images contained in the RGB image data (step S32 in FIG. 12). The Hough transform is used to detect these straight lines L(j).

Specifically, based on the coordinates (x0, y0) of each edge pixel in a two-dimensional orthogonal coordinate system (x-y coordinate system), the straight line detection unit 410 generates a curve in a polar coordinate system (r-θ coordinate system) expressed by equation (2). The variables r and θ are defined in the x-y coordinate system as shown in FIG. 13(a), and a straight line in the x-y coordinate system is expressed by equation (3). That is, through equation (3), the coordinates (r, θ) on the curve of equation (2) represent all the straight lines of different slopes that pass through the coordinates (x0, y0).

Next, the straight line detection unit 410 extracts the intersections of the curves of equation (2) generated from the coordinates of all the edge pixels. As shown in FIG. 13(b), if n curves intersect at one point (r0, θ0), the n edge pixels corresponding to those curves lie on a single straight line expressed by equation (4).
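Equations (2) to (4) are not reproduced in this text. Assuming that r and θ follow the usual normal-form parameterization of the Hough transform (r is the length of the perpendicular from the origin to the line and θ is the angle of that perpendicular, which FIG. 13(a) would need to confirm), they would presumably read as follows:

```latex
\begin{align}
r   &= x_0 \cos\theta + y_0 \sin\theta \tag{2} \\
r   &= x \cos\theta + y \sin\theta \tag{3} \\
r_0 &= x \cos\theta_0 + y \sin\theta_0 \tag{4}
\end{align}
```

Under this assumption, equation (2) is equation (3) evaluated at the edge pixel (x0, y0), and equation (4) is equation (3) evaluated at the common intersection (r0, θ0).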

In this way, by extracting an intersection of the curves, the straight line detection unit 410 can extract the straight line corresponding to that intersection. The straight line detection unit 410 is not limited to extracting intersections at which a plurality of curves meet at exactly one point. Where a plurality of intersections are densely clustered, it extracts those that fall within a predetermined range (a range that allows for variation) as an intersection corresponding to one straight line.

The edge pixels include not only those that form the edges of straight line images corresponding to ruled lines but also those that form the edges of other images such as characters. The intersections therefore include ones formed by curves that correspond to edge pixels unrelated to any straight line image (edge pixels that merely happen to lie on one straight line). The straight line detection unit 410 needs to exclude such intersections.

Specifically, the straight line detection unit 410 sets the predetermined range in each region where intersections are densely clustered. It then counts the number of curves forming the intersections contained in the predetermined range of each region (or the number of edge pixels corresponding to those curves). In this way, the straight line detection unit 410 calculates the total number of curves (the curve count) for each dense region. The straight line detection unit 410 then determines whether the calculated curve count is larger than a threshold. If the curve count is larger than the threshold, the straight line detection unit 410 extracts the edge pixels corresponding to those curves as one pixel group B(j), where j is a number that distinguishes the pixel groups.
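The voting and thresholding described above can be sketched with a coarse accumulator. This is an illustration under stated assumptions rather than the apparatus's procedure: the normal form r = x0 cos θ + y0 sin θ is assumed for the curves, the "predetermined range" is modeled as a single accumulator cell of width `r_step` and one θ step, and the function name and parameters are hypothetical.

```python
import math
from collections import defaultdict


def hough_pixel_groups(edge_pixels, threshold, theta_steps=180, r_step=1.0):
    """Return pixel groups B(j): edge pixels whose curves pass through the same (r, theta) cell."""
    cells = defaultdict(list)
    for x0, y0 in edge_pixels:
        # Trace the curve of this edge pixel through the r-theta plane.
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            r = x0 * math.cos(theta) + y0 * math.sin(theta)
            cells[(round(r / r_step), t)].append((x0, y0))
    # Keep only cells crossed by more curves than the threshold.
    return [pixels for pixels in cells.values() if len(pixels) > threshold]


# Ten edge pixels on the horizontal line y = 5.
line_pixels = [(x, 5) for x in range(10)]
groups = hough_pixel_groups(line_pixels, threshold=8)
print(any(len(g) == 10 for g in groups))
```

Neighbouring cells can report the same line more than once, so a fuller implementation would merge nearby cells before forming the pixel groups.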

Thereafter, for each extracted pixel group B(j), the straight line detection unit 410 calculates equation (5), which represents one straight line L(j), based on the coordinates of the edge pixels contained in that pixel group. Here, {-a(j)/b(j)} and {-c(j)/b(j)}, calculated from the coefficients a(j), b(j), and c(j) of equation (5), represent the slope and the y-intercept, respectively, of the straight line L(j) in the x-y coordinate system. An approximation method such as the least squares method is used to calculate the slope and the y-intercept of the straight line L(j).
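Equation (5) is not reproduced in this text. Given that the slope is -a(j)/b(j) and the y-intercept is -c(j)/b(j), it is presumably the general form of a straight line:

```latex
\begin{equation}
a(j)\,x + b(j)\,y + c(j) = 0 \tag{5}
\end{equation}
```

Solving for y with b(j) ≠ 0 gives y = -(a(j)/b(j)) x - c(j)/b(j), which matches the slope and y-intercept stated above.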

<Straight Line Interval Calculation Unit>
Based on the straight lines L(j) detected by the straight line detection unit 410, the straight line interval calculation unit 411 calculates the interval d(k) between two adjacent straight lines L(k) and L(k+1) according to equation (6) (step S33 in FIG. 12). As shown in FIG. 14, equation (6) expresses, as the interval d(k), the length of the perpendicular dropped from an arbitrary point (x0(k), y0(k)) on the straight line L(k) to the straight line L(k+1).
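Equation (6) is not reproduced in this text. Assuming each line is given in the general form a x + b y + c = 0, the length of the perpendicular is the usual point-to-line distance |a(k+1) x0(k) + b(k+1) y0(k) + c(k+1)| / sqrt(a(k+1)^2 + b(k+1)^2). The sketch below computes it. The function names and the tuple representation (a, b, c) are assumptions, and the point on L(k) is taken at an axis intercept for convenience.

```python
import math


def point_on_line(a, b, c):
    """Return one point (x0, y0) lying on the line a*x + b*y + c = 0."""
    if b != 0:
        return 0.0, -c / b   # y-intercept
    if a != 0:
        return -c / a, 0.0   # vertical line: use the x-intercept
    raise ValueError("a and b must not both be zero")


def line_interval(line_k, line_next):
    """Return d(k): the perpendicular distance from a point on L(k) to L(k+1)."""
    x0, y0 = point_on_line(*line_k)
    a, b, c = line_next
    return abs(a * x0 + b * y0 + c) / math.hypot(a, b)


# Two horizontal lines y = 0 and y = 7, each written as (a, b, c).
print(line_interval((0, 1, 0), (0, 1, -7)))
```

For exactly parallel lines the result does not depend on which point of L(k) is chosen. For nearly parallel lines fitted from noisy pixels it does, so a fixed choice of point is a design decision.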

Preferably, the straight line interval calculation unit 411 extracts, from the straight lines L(j) detected by the straight line detection unit 410, those that are likely to correspond to ruled lines, and calculates the interval d(k) for the extracted straight lines. This improves the accuracy of the comparison in the third comparison unit 412 described later. As one example, the straight line interval calculation unit 411 extracts those straight lines L(j) whose length is equal to or greater than a predetermined length. The predetermined length is, for example, a predetermined proportion (for example, 70%) of the size of the text document (its width for horizontal writing, its height for vertical writing).
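The length filter can be sketched as below. The text does not say how the length of a detected line is measured, so the representation of each line as a segment with two end points, the function name, and the parameter names are assumptions. The default of 0.7 is the 70% example given above.

```python
import math


def filter_ruled_line_candidates(segments, page_extent, min_ratio=0.7):
    """Keep segments whose length is at least min_ratio of the page extent.

    segments: list of ((x1, y1), (x2, y2)) end points (hypothetical representation).
    page_extent: page width for horizontal writing, page height for vertical writing.
    """
    min_length = min_ratio * page_extent
    return [s for s in segments if math.dist(s[0], s[1]) >= min_length]


segments = [((0, 100), (4000, 100)),   # long line, likely a ruled line
            ((50, 300), (120, 300))]   # short stroke, likely part of a character
print(filter_ruled_line_candidates(segments, page_extent=4960))
```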

<Third Comparison Unit>
Based on the intervals d(k) calculated by the straight line interval calculation unit 411, the third comparison unit 412 determines whether the straight lines L(j) detected by the straight line detection unit 410 correspond to ruled lines (step S34 in FIG. 12).

Specifically, the third comparison unit 412 calculates the average of the intervals d(k) and determines whether the deviation of each interval d(k) from that average is within a predetermined error range (for example, ±0.03 mm, which is about one pixel at a resolution of 600 dpi).

If the deviation is within the predetermined error range, the third comparison unit 412 determines whether the interval d(k) (for example, the average value) matches the spacing of ruled lines found on existing paper to within a predetermined error range (for example, ±0.03 mm, which is about one pixel at a resolution of 600 dpi). The ruled line spacings are stored, for example, in the storage unit 6, and the third comparison unit 412 reads them from the storage unit 6 as needed. Existing ruled lines include the following. The values in parentheses below are the ruled line spacing and the corresponding number of pixels (at a resolution of 600 dpi).

Existing ruled lines include those printed on notebooks and letter paper. In Japan, notebook rulings include A ruled (7 mm (168 pixels)), B ruled (6 mm (144 pixels)), C ruled (5 mm (120 pixels)), and U ruled (9 mm (216 pixels)). In Europe and the United States, they include college ruled (7.15 mm (172 pixels)), Spell-Write (8.20 mm (197 pixels)), wide ruled (8.77 mm (211 pixels)), legal ruled (8.92 mm (215 pixels)), and Gregg ruled (8.80 mm (211 pixels)).
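The two-stage check performed by the third comparison unit 412 can be sketched as follows. The sketch works in millimetres and uses the spacings listed above. The function name, the dictionary keys, and the choice to return the name of the first matching ruling are assumptions made for illustration.

```python
from statistics import mean

# Ruled line spacings in mm, taken from the list above.
KNOWN_RULINGS_MM = {
    "A ruled": 7.0, "B ruled": 6.0, "C ruled": 5.0, "U ruled": 9.0,
    "college ruled": 7.15, "spell-write": 8.20, "wide ruled": 8.77,
    "legal ruled": 8.92, "Gregg ruled": 8.80,
}


def match_ruling(intervals_mm, tol_mm=0.03):
    """Return the name of a matching ruling, or None if the lines are not ruled lines."""
    if not intervals_mm:
        return None
    avg = mean(intervals_mm)
    # Stage 1: every interval d(k) must stay within the error range of the average.
    if any(abs(d - avg) > tol_mm for d in intervals_mm):
        return None
    # Stage 2: the average must match a known ruling within the error range.
    for name, pitch in KNOWN_RULINGS_MM.items():
        if abs(avg - pitch) <= tol_mm:
            return name
    return None


print(match_ruling([7.0, 7.01, 6.99]))  # evenly spaced at about 7 mm
print(match_ruling([7.0, 7.5, 6.5]))    # uneven spacing
```

A result other than None corresponds to the case in which the intervals match an existing ruling.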

<Third Determination Unit>
The third determination unit 413 determines, based on the result from the third comparison unit 412, whether the characters in the text document are handwritten. When the third comparison unit 412 determines that the interval d(k) matches the spacing of ruled lines found on existing paper, the text document is likely to have ruled lines, and a document with ruled lines is likely to contain handwritten characters. Therefore, when the third comparison unit 412 determines that the interval d(k) matches the spacing of existing ruled lines, the third determination unit 413 determines that the characters in the text document are handwritten (step S35 in FIG. 12). When the third comparison unit 412 determines that the interval d(k) does not match the spacing of existing ruled lines, the third determination unit 413 determines that the characters in the text document are not handwritten (step S36 in FIG. 12). The third determination unit 413 may also calculate the number of straight lines L(j) detected by the straight line detection unit 410 and take that number into account when determining whether the characters in the text document are handwritten.

According to the third determination process described above, whether the characters in a document are handwritten can be determined with high accuracy even though the process is simple. The third determination process is applicable not only to Japanese but also to other languages.

The third determination process described above may be realized by causing the image processing apparatus to execute a determination processing program. Such a program may be recorded in readable form on a recording medium (a hard disk, a memory card, or the like).

[4] Other Embodiments
[4-1] Fourth Embodiment
The fourth determination process executed by the character determination unit 204b is described in detail below. The fourth determination process determines whether the characters in the text document are handwritten based on the total number of characters recognized from the RGB image data.

When the character determination unit 204b executes the fourth determination process, the character recognition processing unit 402, the recognized character counting unit 404, the fourth comparison unit 414, and the fourth determination unit 415 of the character determination unit 204b each execute their processing in accordance with the flow shown in FIG. 15. The processing executed by the character recognition processing unit 402 and the recognized character counting unit 404 is as described in the first embodiment.

<Fourth Comparison Unit>
The fourth comparison unit 414 calculates the ratio Cr/Cb from the number of recognized characters Cr output by the recognized character counting unit 404 and the total number of extracted characters obtained by the layout analysis unit 402c of the character recognition processing unit 402 (the number of extracted characters Cb). The fourth comparison unit 414 then determines whether the calculated ratio Cr/Cb is smaller than a threshold Th4.

<Fourth Determination Unit>
The fourth determination unit 415 determines, based on the result from the fourth comparison unit 414, whether the characters in the text document are handwritten or printed. Specifically, when the fourth comparison unit 414 determines that the ratio Cr/Cb is smaller than the threshold Th4, the fourth determination unit 415 determines that the characters in the text document are handwritten. When the fourth comparison unit 414 determines that the ratio Cr/Cb is not smaller than the threshold Th4, the fourth determination unit 415 determines that the characters in the text document are printed.

The number of recognized characters Cr tends to be markedly smaller for a handwritten document than for a printed document. This is because the form of handwritten characters varies from writer to writer, which lowers the accuracy of character recognition. The ratio Cr/Cb therefore also tends to be markedly smaller for a handwritten document than for a printed document. Setting the threshold Th4 to an appropriate value in view of this tendency makes it possible to base the determination on a comparison of the ratio Cr/Cb with Th4.

According to the fourth determination process described above, whether the characters in a document are handwritten can be determined with high accuracy even though the process is simple. The fourth determination process is applicable not only to Japanese but also to other languages.

[4-2] Fifth Embodiment
The first to fourth determination processes described above may each be executed individually by the character determination unit 204b, or some or all of them may be executed in combination by the character determination unit 204b.

As one example, when all of the first to fourth determination processes are executed in combination, the character determination unit 204b is configured as shown in FIG. 16. The processing executed by the units other than the comparison unit 416 and the comprehensive determination unit 417 is as described in the first to fourth embodiments. The comparison unit 416 includes all of the first comparison unit 405, the second comparison unit 408, the third comparison unit 412, and the fourth comparison unit 414 described above.

The comprehensive determination unit 417 determines whether the characters in the text document are handwritten by considering the plurality of results from the comparison unit 416 together. Specifically, when at least one of the first to fourth determination processes yields the result that the characters in the text document are handwritten, the comprehensive determination unit 417 outputs the overall result that the characters in the text document are handwritten. Alternatively, the comprehensive determination unit 417 may output that overall result only when two, three, or all of the items yield the result that the characters are handwritten.
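The overall determination amounts to a vote over the individual results. A minimal sketch follows, in which the function name and the `min_votes` parameter are hypothetical: `min_votes=1` corresponds to the "at least one item" rule, and larger values correspond to the stricter variants mentioned above.

```python
from typing import Iterable


def overall_is_handwritten(results: Iterable[bool], min_votes: int = 1) -> bool:
    """Return True when at least min_votes of the individual results say 'handwritten'."""
    if min_votes < 1:
        raise ValueError("min_votes must be at least 1")
    return sum(bool(r) for r in results) >= min_votes


# Results of the first to fourth determination processes (True means handwritten).
results = [False, True, False, False]
print(overall_is_handwritten(results))               # one vote suffices
print(overall_is_handwritten(results, min_votes=2))  # stricter variant
```

A lower `min_votes` favours detecting handwriting at the cost of more false positives, so the choice depends on how costly it is to apply expansion processing to a printed document.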

According to this determination process, whether the characters in a document are handwritten can be determined with still higher accuracy even though the process is simple. This determination process is applicable not only to Japanese but also to other languages.

[4-3] Sixth Embodiment
The document determination process and the character determination process executed by the automatic document determination unit 204 (the document determination unit 204a and the character determination unit 204b) of the image processing apparatus described above are applicable not only when the copy mode is selected as the operation mode of the image forming apparatus but also when another operation mode that requires reading a document (an image transmission mode, a facsimile transmission/reception mode, or the like) is selected.

In these other operation modes as well, after the image input device 1 reads the document, the image processing apparatus 2 executes image processing based on the RGB analog signal generated by the image input device 1. The processing executed by the image processing apparatus 2 when the image transmission mode is selected is described below with reference to FIG. 17.

<<Image Processing Apparatus (Processing in Image Transmission Mode)>>
In the image processing apparatus 2, the A/D conversion unit 201, the shading correction unit 202, the input processing unit 203, the automatic document determination unit 204, the expansion processing unit 205, the area separation processing unit 206, the color correction unit 207, the spatial filter unit 209, the output tone correction unit 210, and the formatting processing unit 212 each execute their processing. The processing executed by the units other than the color correction unit 207, the spatial filter unit 209, the output tone correction unit 210, and the formatting processing unit 212 is the same as in the copy mode, so its description is omitted.

<Color Correction Unit>
The color correction unit 207 converts the color space of the RGB image data (the RGB image data after expansion processing, if the expansion processing unit 205 has applied expansion processing) from the RGB space to a color space suited to the display characteristics of commonly used display devices (for example, the sRGB color space). The converted image data is input directly to the spatial filter unit 209 without being processed by the black generation/under color removal unit 208. The image data generated by this color space conversion is hereinafter referred to as "R'G'B' image data".

<Spatial Filter Unit>
The spatial filter unit 209 applies processing such as enhancement and smoothing to the R'G'B' image data generated by the color correction unit 207. The determination results from the automatic document determination unit 204 and the area separation processing unit 206 (the document determination data and the area separation data) are used as needed.

<Output Tone Correction Unit>
The output tone correction unit 210 applies output γ correction to the R'G'B' image data so that the image has appropriate brightness. The processed R'G'B' image data is input directly to the formatting processing unit 212 without being processed by the halftone generation unit 211.

<Formatting Processing Unit>
The formatting processing unit 212 converts the R'G'B' image data processed by the output tone correction unit 210 into the PDF (Portable Document Format) file format.

<<Transmission/Reception Device>>
The transmission/reception device 5 transmits and receives image data via a network. Specifically, the transmission/reception device 5 has a function of receiving image data from an externally connected device such as a personal computer and a function of transmitting image data to an externally connected device. The image data received by the transmission/reception device 5 is stored, for example, in the storage unit 6.

As an example of transmission, the transmitting/receiving device 5 sends the PDF file generated by the formatting processing unit 212 to the external connection device over a network or communication line, with the file attached to an e-mail by a mail processing unit (not shown).

[4-4] Seventh Embodiment
The document determination processing and the character determination processing executed by the automatic document determination unit 204 (the document determination unit 204a and the character determination unit 204b) of the image processing apparatus described above are applicable not only to multifunction peripherals but also to color image reading apparatuses such as scanners.

Specifically, as shown in FIG. 18, the color image reading apparatus includes an image input device 1, an image processing apparatus 2, a storage unit 6, and a control unit 7. In the image processing apparatus 2, the A/D conversion unit 201, the shading correction unit 202, the input processing unit 203, the automatic document determination unit 204, the expansion processing unit 205, the color correction unit 207, and the formatting processing unit 212 each execute their processing. The details of the processing executed by each unit are as described in the first and sixth embodiments.

The PDF file generated by the formatting processing unit 212 is output from the image processing apparatus 2. The output image data is then transmitted to an external connection device over a network or communication line.

The above description of the embodiments is to be considered in all respects as illustrative and not restrictive. The scope of the present invention is indicated not by the embodiments described above but by the claims. Furthermore, the scope of the present invention is intended to include all modifications within the meaning and range of equivalency of the claims.

204 automatic document determination unit
204a document determination unit
204b character determination unit
205 expansion processing unit
401 edge extraction unit
402 character recognition processing unit
402a signal conversion unit
402b binarization processing unit
402c layout analysis unit
402d character recognition unit
403 edge counting unit
404 recognized character counting unit
405 first comparison unit
406 first determination unit
407 continuous character counting unit
408 second comparison unit
409 second determination unit
410 straight line detection unit
411 straight line interval calculation unit
412 third comparison unit
413 third determination unit
414 fourth comparison unit
415 fourth determination unit
416 comparison unit
417 overall determination unit
Ca number of consecutive characters
Cb number of extracted characters
Ce number of edge pixels
Cr number of recognized characters
Th1, Th2, Th4 thresholds

Claims

An image processing apparatus comprising:
a character recognition processing unit that recognizes characters included in image data of a document;
a recognized character counting unit that calculates the total number of recognized characters by counting the characters recognized by the character recognition processing unit; and
a determination unit that determines, based on the total number calculated by the recognized character counting unit, whether the characters described in the document are handwritten characters.

The image processing apparatus according to claim 1, further comprising:
an edge extraction unit that extracts edge pixels constituting edges of an image included in the image data;
an edge counting unit that calculates the total number of edge pixels by counting the edge pixels extracted by the edge extraction unit; and
a first comparison unit that determines whether a first ratio, which is the ratio of the total number of edge pixels to the total number calculated by the recognized character counting unit, is greater than a first threshold,
wherein the determination unit determines that the characters described in the document are handwritten characters when the first comparison unit determines that the first ratio is greater than the first threshold.
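The first-ratio comparison recited above can be sketched as follows. The function and parameter names are hypothetical, and the handling of a document with no recognized characters is an assumption, since the claim does not address that case.

```python
def is_handwritten_by_edge_ratio(edge_pixel_count, recognized_char_count, th1):
    """Return True when (edge pixels / recognized characters) exceeds the first threshold.

    edge_pixel_count      -- total number of edge pixels
    recognized_char_count -- total number of recognized characters
    th1                   -- first threshold (its value is not given in the claim)
    """
    if recognized_char_count == 0:
        # Assumption: with no recognized characters the ratio is undefined;
        # treat any edge content as exceeding the threshold.
        return edge_pixel_count > 0
    return edge_pixel_count / recognized_char_count > th1
```

For example, with a hypothetical threshold of 200, a page with 5000 edge pixels and 10 recognized characters gives a ratio of 500 and is judged handwritten, while the same edge count with 100 recognized characters gives a ratio of 50 and is not.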

The image processing apparatus according to claim 1, further comprising:
a continuous character counting unit that calculates the total number of consecutive characters by counting, among the characters recognized by the character recognition processing unit, the characters whose sizes are determined to be identical over a predetermined number or more of consecutive characters; and
a second comparison unit that determines whether a second ratio, which is the ratio of the total number of consecutive characters to the total number calculated by the recognized character counting unit, is smaller than a second threshold,
wherein the determination unit determines that the characters described in the document are handwritten characters when the second comparison unit determines that the second ratio is smaller than the second threshold.
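A minimal sketch of the consecutive-character count and the second-ratio comparison follows. The run length `min_run`, the size `tolerance`, the choice to compare each character with the first character of its run, and the treatment of an empty input are all illustrative assumptions, not details taken from the claim.

```python
def count_consecutive_same_size(sizes, min_run=3, tolerance=1):
    """Count characters belonging to runs of at least min_run same-size characters.

    sizes is a list of character sizes (e.g. heights in pixels) in reading order.
    Two sizes are treated as "the same" when they differ by at most `tolerance`
    from the first character of the current run (an assumed criterion).
    """
    total = 0
    run_start = 0
    for i in range(1, len(sizes) + 1):
        run_ended = i == len(sizes) or abs(sizes[i] - sizes[run_start]) > tolerance
        if run_ended:
            run_len = i - run_start
            if run_len >= min_run:
                total += run_len
            run_start = i
    return total


def is_handwritten_by_size_ratio(sizes, th2, min_run=3, tolerance=1):
    """Return True when (consecutive characters / recognized characters) is below th2."""
    if not sizes:
        return False  # assumption: no recognized characters, so no evidence from this test
    consecutive = count_consecutive_same_size(sizes, min_run, tolerance)
    return consecutive / len(sizes) < th2
```

With sizes `[12, 12, 12, 12, 20, 12, 13, 30]` only the first four characters form a qualifying run, so the ratio is 4/8 = 0.5.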

An image processing apparatus comprising:
an edge extraction unit that extracts edge pixels constituting edges of an image included in image data of a document;
a straight line detection unit that detects, based on the edge pixels extracted by the edge extraction unit, straight lines corresponding to edges of straight-line images included in the image data;
a straight line interval calculation unit that calculates the interval between the straight lines detected by the straight line detection unit; and
a determination unit that determines, based on the interval calculated by the straight line interval calculation unit, whether the characters described in the document are handwritten characters.

The image processing apparatus, wherein the straight line interval calculation unit extracts, from the straight lines detected by the straight line detection unit, those having a length equal to or greater than a predetermined length, and calculates the interval for the extracted straight lines.
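The length filtering and interval calculation recited above might look like the sketch below. It assumes, purely for illustration, that each detected straight line is horizontal and represented as a hypothetical `(y, x_start, x_end)` tuple. How the resulting intervals are turned into a handwriting decision is not shown, because that rule is not stated in this claim.

```python
def line_intervals(lines, min_length):
    """Return the vertical gaps between adjacent lines that are at least min_length long.

    lines      -- iterable of (y, x_start, x_end) tuples (assumed representation)
    min_length -- the predetermined minimum line length
    """
    ys = sorted(y for (y, x0, x1) in lines if abs(x1 - x0) >= min_length)
    return [b - a for a, b in zip(ys, ys[1:])]
```

Discarding short lines first keeps character strokes and small marks from being mistaken for long lines when the spacing is measured.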

The image processing apparatus according to any one of claims 1 to 5, further comprising an expansion processing unit that performs expansion processing on the image data when the determination unit determines that the characters described in the document are handwritten characters.
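Expansion processing corresponds to morphological dilation. The sketch below applies dilation to a binary image only when the characters were judged handwritten. The square structuring element and its radius are assumptions, as are the function names.

```python
def dilate(binary, radius=1):
    """Dilate a binary image (2D list of 0/1) with a (2*radius+1)-square element."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x]:
                # Set every pixel in the neighborhood of a foreground pixel.
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = 1
    return out


def expand_if_handwritten(binary, handwritten):
    """Apply expansion processing only when the determination result is 'handwritten'."""
    return dilate(binary) if handwritten else binary
```

Dilation thickens thin strokes, which is consistent with the stated purpose of reducing faint or broken handwritten characters in the output image.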

An image processing program that causes an image processing apparatus to execute:
character recognition processing for recognizing characters included in image data of a document;
recognized character counting processing for calculating the total number of recognized characters by counting the characters recognized by the character recognition processing; and
determination processing for determining, based on the total number calculated by the recognized character counting processing, whether the characters described in the document are handwritten characters.

An image processing program that causes an image processing apparatus to execute:
edge extraction processing for extracting edge pixels constituting edges of an image included in image data of a document;
straight line detection processing for detecting, based on the edge pixels extracted by the edge extraction processing, straight lines corresponding to edges of straight-line images included in the image data;
straight line interval calculation processing for calculating the interval between the straight lines detected by the straight line detection processing; and
determination processing for determining, based on the interval calculated by the straight line interval calculation processing, whether the characters described in the document are handwritten characters.