JP7779130B2

JP7779130B2 - Image recognition device and image recognition method

Info

Publication number: JP7779130B2
Application number: JP2021207145A
Authority: JP
Inventors: 卓児玉
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2021-12-21
Filing date: 2021-12-21
Publication date: 2025-12-03
Anticipated expiration: 2041-12-21
Also published as: JP2023092134A

Description

本発明は、画像認識装置、および画像認識方法に関する。 The present invention relates to an image recognition device and an image recognition method.

ＯＣＲ（Optical Character Reader）技術を用いて、記録媒体に形成された文字や図形の画像を認識する画像認識装置が知られている。 Image recognition devices are known that use OCR (Optical Character Reader) technology to recognize images of characters and figures formed on recording media.

上記の画像認識装置として、認識対象とする帳票画像を定義する際に、帳票画像に含まれる項目と、項目に対する選択肢と、を含む領域の指定を受け付け、領域から抽出された、項目と選択肢とを対応付けた定義を表示するものが開示されている（例えば、特許文献１参照）。 One such image recognition device has been disclosed that, when defining a form image to be recognized, accepts the specification of an area containing items and options for the items included in the form image, and displays a definition that associates the items with the options extracted from the area (see, for example, Patent Document 1).

画像認識装置では、帳票画像等の入力画像に含まれ、項目に対する選択を受け付ける選択受付図形の認識精度に優れたものが求められる。 Image recognition devices are required to have excellent recognition accuracy for selection acceptance figures that are included in input images such as form images and that accept selections for items.

本発明は、入力画像に含まれる選択受付図形の認識精度に優れた画像認識装置を提供することを目的とする。 The present invention aims to provide an image recognition device with excellent recognition accuracy for selection acceptance figures contained in input images.

本発明の一態様に係る画像認識装置は、原稿の少なくとも一部が入力された入力画像に含まれ、項目に対する選択を受け付ける選択受付図形を認識可能な画像認識装置であって、前記入力画像に含まれる図形の外形形状の特徴点に基づき、前記選択受付図形の頂点候補の検出結果を出力する頂点候補出力部と、前記頂点候補出力部から受け取った前記頂点候補の検出結果に基づき、前記頂点候補が前記選択受付図形の頂点候補であるか否かを判定する頂点候補評価部と、前記頂点候補評価部による判定結果に基づき、前記選択受付図形の認識結果を出力する認識結果出力部と、を有し、前記頂点候補評価部は、複数の前記頂点候補同士を結ぶ直線が着色しているか否かを判定すること、又は前記頂点候補同士の間の距離と、前記入力画像に含まれる文字のサイズと、を比較することで、前記頂点候補が前記選択受付図形の頂点候補であるか否かを判定する。
An image recognition device according to one aspect of the present invention is an image recognition device capable of recognizing a selection acceptance figure that is included in an input image into which at least a portion of a manuscript is input, and that accepts selections for items. The image recognition device has: a vertex candidate output unit that outputs a detection result of a vertex candidate of the selection acceptance figure based on characteristic points of the external shape of the figure included in the input image; a vertex candidate evaluation unit that determines whether the vertex candidate is a vertex candidate of the selection acceptance figure based on the detection result of the vertex candidate received from the vertex candidate output unit; and a recognition result output unit that outputs a recognition result of the selection acceptance figure based on the determination result by the vertex candidate evaluation unit. The vertex candidate evaluation unit determines whether the vertex candidate is a vertex candidate of the selection acceptance figure by determining whether a straight line connecting multiple vertex candidates is colored, or by comparing the distance between the vertex candidates with the size of the characters included in the input image.

本発明によれば、入力画像に含まれる選択受付図形の認識精度に優れた画像認識装置を提供できる。 This invention provides an image recognition device with excellent recognition accuracy for selection acceptance figures contained in input images.

実施形態に係る画像認識装置の入力画像を例示する図である。 A diagram illustrating an example of an input image of the image recognition device according to the embodiment.
実施形態に係る画像認識装置の出力画像を例示する図である。 A diagram illustrating an example of an output image of the image recognition device according to the embodiment.
実施形態に係る画像認識装置を有する画像形成装置の構成例の図である。 A diagram of a configuration example of an image forming apparatus including the image recognition device according to the embodiment.
実施形態に係る画像形成装置の構成例のブロック図である。 A block diagram of a configuration example of the image forming apparatus according to the embodiment.
実施形態に係るコントローラのハードウェア構成例のブロック図である。 A block diagram of a hardware configuration example of the controller according to the embodiment.
実施形態に係るＩＰＵのハードウェア構成例のブロック図である。 A block diagram of a hardware configuration example of the IPU according to the embodiment.
実施形態に係るＩＰＵの機能構成例のブロック図である。 A block diagram of a functional configuration example of the IPU according to the embodiment.
実施形態に係る画像形成装置による原稿読取の動作例のフロー図である。 A flowchart of an example of a document reading operation by the image forming apparatus according to the embodiment.
実施形態に係るＩＰＵによる認識処理例のフロー図である。 A flowchart of an example of recognition processing by the IPU according to the embodiment.
実施形態に係る入力画像例を示す図である。 A diagram showing an example of an input image according to the embodiment.
実施形態に係る入力画像のヒストグラムを例示する図である。 A diagram illustrating a histogram of an input image according to the embodiment.
実施形態に係る二値化画像例を示す図である。 A diagram showing an example of a binarized image according to the embodiment.
実施形態に係る外形形状画像例を示す図である。 A diagram showing an example of an outer shape image according to the embodiment.
実施形態に係る頂点候補検出処理例を説明する第１図である。 A first diagram illustrating an example of vertex candidate detection processing according to the embodiment.
実施形態に係る頂点候補検出処理例を説明する第２図である。 A second diagram illustrating an example of vertex candidate detection processing according to the embodiment.
実施形態に係る頂点候補検出処理例を説明する第３図である。 A third diagram illustrating an example of vertex candidate detection processing according to the embodiment.
実施形態に係る頂点候補評価処理例を説明する図である。 A diagram illustrating an example of vertex candidate evaluation processing according to the embodiment.
実施形態に係る出力画像の第１例を示す図である。 A diagram showing a first example of an output image according to the embodiment.
実施形態に係る出力画像の第２例を示す図である。 A diagram showing a second example of an output image according to the embodiment.
実施形態に係る傾き検出処理例を説明する図である。 A diagram illustrating an example of tilt detection processing according to the embodiment.
実施形態に係る色反転処理例を説明する図である。 A diagram illustrating an example of color inversion processing according to the embodiment.
実施形態に係る黒色領域の拡大処理例を説明する図である。 A diagram illustrating an example of black region enlargement processing according to the embodiment.
その他の実施形態に係る出力画像の第１例を示す図である。 A diagram showing a first example of an output image according to another embodiment.
その他の実施形態に係るＩＰＵの機能構成例のブロック図である。 A block diagram of a functional configuration example of an IPU according to another embodiment.
その他の実施形態に係る出力画像の第２例を示す図である。 A diagram showing a second example of an output image according to another embodiment.

以下、図面を参照して本発明を実施するための形態について詳細に説明する。各図面において、同一構成部には同一符号を付し、重複した説明を適宜省略する。 The following describes in detail the embodiments of the present invention with reference to the drawings. In each drawing, the same components are designated by the same reference numerals, and duplicate explanations will be omitted where appropriate.

実施形態に係る画像認識装置は、原稿の少なくとも一部が入力された入力画像に含まれ、項目に対する選択を受け付ける選択受付図形を認識可能なものである。 The image recognition device according to the embodiment is capable of recognizing a selection acceptance figure that is included in an input image into which at least a portion of a manuscript is input, and that accepts selections for items.

選択受付図形とは、選択の対象項目周辺に形成され、選択されたことを示すマークが入力される図形をいう。例えば、選択受付図形は、原稿に含まれる選択の対象項目周辺に形成された四角図形であって、選択された際に、選択されたことを示すチェックマークが付与される、いわゆるチェックボックス等である。対象項目とは、選択受付図形を用いて選択される項目を意味する。 A selection acceptance figure is a figure that is formed around an item to be selected and into which a mark indicating that it has been selected is entered. For example, a selection acceptance figure is a rectangular figure formed around an item to be selected that is included in a manuscript, such as a checkbox, into which a check mark indicating that it has been selected is added when the item is selected. A target item is an item that is selected using a selection acceptance figure.

実施形態では、一例として選択受付図形がチェックボックスである場合を説明する。 The embodiment describes, as an example, the case where the selection acceptance figure is a check box.

図１および図２は、実施形態に係る画像認識装置に対する入出力画像を例示する図であり、図１は入力画像Ｉｍ０、図２は出力画像Ｑを例示している。 Figures 1 and 2 are diagrams illustrating input and output images for an image recognition device according to an embodiment, with Figure 1 illustrating input image Im0 and Figure 2 illustrating output image Q.

図１に示すように、入力画像Ｉｍ０は、対象項目５１０の画像と、対象項目５１０に隣り合って形成されたチェックボックス５１１と、を含んでいる。 As shown in Figure 1, input image Im0 includes an image of a target item 510 and a check box 511 formed adjacent to the target item 510.

実施形態に係る画像認識装置は、入力画像Ｉｍ０の中からチェックボックス５１１を認識し、図２に示すように、認識したチェックボックス５１１を指示する認識マーク５２１を付与した出力画像Ｑを出力する。 The image recognition device according to the embodiment recognizes check boxes 511 from the input image Im0 and outputs an output image Q to which a recognition mark 521 indicating the recognized check box 511 has been added, as shown in FIG. 2.

実施形態に係る画像認識装置によりチェックボックス５１１を認識することによって、原稿Ｐｉにおけるチェックボックス５１１に、チェックマーク５２２が付与されているか否か等を、後段の処理装置が正確に認識可能になる。 By recognizing the check box 511 using the image recognition device according to the embodiment, a downstream processing device can accurately recognize whether or not a check mark 522 has been added to the check box 511 in the document Pi.

実施形態に係る画像認識装置は、入力画像Ｉｍ０に含まれる図形の外形形状の特徴点に基づき、チェックボックス５１１の頂点候補の検出結果を出力する頂点候補出力部と、頂点候補出力部から出力される頂点候補の検出結果に基づき、チェックボックス５１１の認識結果を出力する認識結果出力部と、を有する。この構成により、入力画像Ｉｍ０に含まれるチェックボックス５１１の認識精度に優れた画像認識装置を提供する。ここで、頂点とは、チェックボックス５１１における図形の角部分をいう。 The image recognition device according to the embodiment includes a vertex candidate output unit that outputs the detection results of vertex candidates for the check box 511 based on feature points of the external shape of a figure included in the input image Im0, and a recognition result output unit that outputs the recognition results of the check box 511 based on the vertex candidate detection results output from the vertex candidate output unit. This configuration provides an image recognition device with excellent recognition accuracy for the check box 511 included in the input image Im0. Here, a vertex refers to a corner of the figure in the check box 511.

以下、実施形態に係る画像認識装置を有する画像形成装置を一例として説明する。この画像形成装置は、例えば用紙等の記録媒体に画像を形成する機能と、画像が形成された用紙等の原稿を読み取る機能と、を有する複合機（MFP：Multi-Function Peripheral）である。実施形態に係る画像認識装置は、画像形成装置により読み取られた原稿の画像の少なくとも一部が入力された入力画像Ｉｍ０から、チェックボックス５１１を認識する。 The following describes an image forming device having an image recognition device according to an embodiment as an example. This image forming device is a multi-function peripheral (MFP) that has the function of forming an image on a recording medium such as paper, and the function of reading an original document such as paper on which an image has been formed. The image recognition device according to an embodiment recognizes a check box 511 from an input image Im0, which contains at least a portion of an image of an original document read by the image forming device.

＜画像形成装置１の構成例＞ <Configuration example of image forming apparatus 1>
（全体構成例） (Overall configuration example)
図３は、画像形成装置１の全体構成の一例を示す図である。図３に示すように、画像形成装置１は、スキャナ１００と、プロッタ３００と、給紙部４００と、を有する。 FIG. 3 is a diagram showing an example of the overall configuration of the image forming apparatus 1. As shown in FIG. 3, the image forming apparatus 1 has a scanner 100, a plotter 300, and a paper feed unit 400.

給紙部４００は、サイズの異なる用紙を収納する給紙カセット４２１及び４２２と、給紙カセット４２１及び４２２に収納された用紙をプロッタ３００による画像形成位置まで搬送する各種ローラからなる給紙手段４２３と、を有する。 The paper feed unit 400 has paper feed cassettes 421 and 422 that store paper of different sizes, and a paper feed means 423 consisting of various rollers that transport the paper stored in the paper feed cassettes 421 and 422 to the image formation position by the plotter 300.

プロッタ３００は、露光装置３３１と、感光体ドラム３３２と、現像装置３３３と、転写ベルト３３４と、定着装置３３５と、を有する。プロッタ３００は、スキャナ１００により読取られた原稿の画像データ、或いはネットワークＩ／Ｆを介して外部のＰＣ等から入力された画像データに基づいて、露光装置３３１により感光体ドラム３３２を露光して感光体ドラム３３２に潜像を形成する。プロッタ３００は、現像装置３３３により感光体ドラム３３２に異なる色のトナーを供給して現像する。プロッタ３００は、転写ベルト３３４により感光体ドラム３３２に現像されたトナー像を給紙部４００から供給された用紙に転写した後、定着装置３３５により用紙に転写されたトナー像を構成するトナーを溶融させて、用紙にカラー画像を定着させる。 The plotter 300 has an exposure device 331, a photosensitive drum 332, a developing device 333, a transfer belt 334, and a fixing device 335. The plotter 300 exposes the photosensitive drum 332 using the exposure device 331 to form a latent image on the photosensitive drum 332 based on image data of an original scanned by the scanner 100 or image data input from an external PC or the like via a network I/F. The plotter 300 develops the image by supplying different color toners to the photosensitive drum 332 using the developing device 333. The plotter 300 transfers the toner image developed on the photosensitive drum 332 using the transfer belt 334 to paper supplied from the paper feed unit 400, and then the fixing device 335 melts the toner that makes up the toner image transferred to the paper, fixing the color image to the paper.

スキャナ１００は、ＡＤＦ（Auto Document Feeder）４１と、スキャナユニット４２と、排紙トレイ４３と、を有する。スキャナ１００は、ＡＤＦ４１を駆動させ、ＡＤＦ４１に載置された原稿をスキャナユニット４２に搬送する。スキャナ１００は、スキャナユニット４２を駆動させ、ＡＤＦ４１から搬送される原稿を撮像させる。ＡＤＦ４１に原稿が載置されておらず、スキャナユニット４２に直接原稿が載置された場合には、スキャナユニット４２は、載置された原稿を撮像する。すなわち、スキャナユニット４２が原稿の撮像部として動作する。 Scanner 100 has an ADF (Auto Document Feeder) 41, a scanner unit 42, and a paper output tray 43. Scanner 100 drives ADF 41 to transport a document placed on ADF 41 to scanner unit 42. Scanner 100 drives scanner unit 42 to capture an image of the document transported from ADF 41. If a document is not placed on ADF 41 and is placed directly on scanner unit 42, scanner unit 42 captures an image of the placed document. In other words, scanner unit 42 operates as a document imaging unit.

図３に示すスキャナユニット４２は、例えば、差動ミラー駆動方式であって、光学センサ一体型駆動方式のものである。スキャナユニット４２は、第１ミラーユニット２０４と、第２ミラーユニット２１０と、レンズ２１６と、第１センサボード２１５と、を有する。 The scanner unit 42 shown in Figure 3 is, for example, a differential mirror drive system with an integrated optical sensor drive system. The scanner unit 42 has a first mirror unit 204, a second mirror unit 210, a lens 216, and a first sensor board 215.

第１ミラーユニット２０４は、ＬＥＤ（Light Emitting Diode）等の発光部を含む光源と、ミラーと、を有する光学ユニットである。第１ミラーユニット２０４は、光源からの光を載置された原稿に照明し、原稿による反射光を第２ミラーユニット２１０に向けて反射する。 The first mirror unit 204 is an optical unit that includes a light source, such as an LED (Light Emitting Diode), and a mirror. The first mirror unit 204 illuminates the placed document with light from the light source and reflects the light reflected by the document toward the second mirror unit 210.

第２ミラーユニット２１０は、第１ミラーユニット２０４からの光をレンズ２１６に向けて反射する。レンズ２１６は、第２ミラーユニット２１０からの光を、第１センサボード２１５に設けられた光電変換素子２１４上において略結像させる。略結像した原稿の像は、光電変換素子２１４が光電変換してアナログ画像信号とすることにより、原稿の読み取りが行われる。 The second mirror unit 210 reflects the light from the first mirror unit 204 toward the lens 216. The lens 216 causes the light from the second mirror unit 210 to form an approximate image on the photoelectric conversion element 214 provided on the first sensor board 215. The photoelectric conversion element 214 photoelectrically converts the approximate image of the original document into an analog image signal, thereby reading the original document.

図４は、画像形成装置１の構成を例示するブロック図である。画像形成装置１は、スキャナ１００と、コントローラ２００と、プロッタ３００と、画像メモリ４０１と、ＩＰＵ（Image Processing Unit）５００と、を有する。 Figure 4 is a block diagram illustrating the configuration of the image forming device 1. The image forming device 1 includes a scanner 100, a controller 200, a plotter 300, an image memory 401, and an IPU (Image Processing Unit) 500.

画像形成装置１は、スキャナ１００により読み取った原稿Ｐｉの少なくとも一部が入力された入力画像Ｉｍ０に対し、ＩＰＵ５００により画像認識処理を行う。なお、ＩＰＵ５００は、画像認識処理以外の画像処理を行ってもよい。画像形成装置１は、ＩＰＵ５００による画像処理後の画像データや、テストチャート（テストパターン）の画像データに基づいて、プロッタ３００により用紙に画像形成し、画像形成された用紙を印刷物Ｐｏとして出力する。 The image forming device 1 uses the IPU 500 to perform image recognition processing on the input image Im0, which is at least a portion of the original document Pi scanned by the scanner 100. Note that the IPU 500 may also perform image processing other than image recognition processing. The image forming device 1 forms an image on paper using the plotter 300 based on the image data after image processing by the IPU 500 and image data of a test chart (test pattern), and outputs the paper with the image formed on it as a printout Po.

コントローラ２００は、画像形成装置１全体の制御を行う。 The controller 200 controls the entire image forming device 1.

画像メモリ４０１は、揮発性メモリや、ハードディスク等を含む。画像メモリ４０１は、スキャナ１００、コントローラ２００、プロッタ３００およびＩＰＵ５００のそれぞれの間において画像データの送受を行うための一次保管や、後日使用するための恒久保管を行う。 The image memory 401 includes volatile memory, a hard disk, etc. The image memory 401 provides temporary storage for sending and receiving image data between the scanner 100, controller 200, plotter 300, and IPU 500, as well as permanent storage for later use.

ＩＰＵ５００は、入力画像Ｉｍ０に含まれ、項目に対する選択を受け付けるチェックボックス５１１を認識可能な画像認識装置の一例である。 The IPU 500 is an example of an image recognition device that can recognize a check box 511 that is included in the input image Im0 and that accepts selections for items.

画像形成装置１は、ネットワークＮＷを経由し、ＰＣ２やサーバー３等に接続し、画像データ等を送受信できる。 Image forming device 1 can connect to PC 2, server 3, etc. via network NW to send and receive image data, etc.

＜コントローラ２００のハードウェア構成例＞ <Example of hardware configuration of controller 200>
図５は、コントローラ２００のハードウェア構成の一例を示すブロック図である。図５に示すように、コントローラ２００は、ＣＰＵ（Central Processing Unit）１０と、ＲＡＭ（Random Access Memory）１１と、ＲＯＭ（Read Only Memory）１２と、ＨＤＤ（Hard Disk Drive）１３と、通信Ｉ／Ｆ（Interface）１４と、を有する。これらはバス１５を介して相互に電気的に接続している。また、通信Ｉ／Ｆ１４は、表示装置１６及び入力操作部１７に接続している。 FIG. 5 is a block diagram showing an example of the hardware configuration of the controller 200. As shown in FIG. 5, the controller 200 has a CPU (Central Processing Unit) 10, a RAM (Random Access Memory) 11, a ROM (Read Only Memory) 12, an HDD (Hard Disk Drive) 13, and a communication I/F (Interface) 14. These are electrically connected to each other via a bus 15. The communication I/F 14 is also connected to a display device 16 and an input operation unit 17.

ＣＰＵ１０は演算手段であり、画像形成装置１全体の動作を制御する。ＲＡＭ１１は、情報の高速な読み書きが可能な揮発性の記憶媒体である。ＣＰＵ１０は、情報を処理する際の作業領域としてＲＡＭ１１を用いる。ＲＯＭ１２は、読み出し専用の不揮発性記憶媒体であり、ファームウェア等のプログラムを格納している。 The CPU 10 is a computing unit that controls the overall operation of the image forming apparatus 1. The RAM 11 is a volatile storage medium that allows high-speed reading and writing of information. The CPU 10 uses the RAM 11 as a working area when processing information. The ROM 12 is a read-only non-volatile storage medium that stores programs such as firmware.

ＨＤＤ１３は、情報の読み書きが可能な不揮発性の記憶媒体であり、ＯＳ（Operating System）や各種の制御プログラム、アプリケーションプログラム等を格納している。通信Ｉ／Ｆ１４は、バス１５と、各種のハードウェアやネットワーク等と、を接続して制御するインターフェースである。 The HDD 13 is a non-volatile storage medium that can read and write information, and stores the OS (Operating System), various control programs, application programs, etc. The communication I/F 14 is an interface that connects and controls the bus 15 with various hardware and networks, etc.

表示装置１６は、ユーザが画像形成装置１の状態を確認するための視覚的ユーザインターフェースである。入力操作部１７は、キーボードやマウス等、ユーザが画像形成装置１に情報を入力するためのユーザインターフェースである。 The display device 16 is a visual user interface that allows the user to check the status of the image forming device 1. The input operation unit 17 is a user interface, such as a keyboard or mouse, that allows the user to input information into the image forming device 1.

画像形成装置１は、上記のハードウェア構成において、ＲＯＭ１２やＨＤＤ１３又は光学ディスク等の記録媒体に格納されたプログラムをＲＡＭ１１に読み出し、ＣＰＵ１０の制御に従って動作させることにより、ソフトウェア制御部を構成する。画像形成装置１は、このようにして構成されたソフトウェア制御部と、ハードウェアと、を組み合わせることによって、画像形成装置１の機能を実現する。 In the image forming device 1, a software control unit is configured by loading a program stored on a recording medium such as ROM 12, HDD 13, or optical disk into RAM 11 and running it under the control of CPU 10 in the hardware configuration described above. The image forming device 1 realizes its functions by combining the software control unit configured in this way with hardware.

＜ＩＰＵ５００のハードウェア構成例＞ <Example of hardware configuration of IPU 500>
図６は、ＩＰＵ５００のハードウェア構成の一例を示すブロック図である。図６に示すように、ＩＰＵ５００は、ＣＰＵ２０と、ＲＡＭ２１と、ＲＯＭ２２と、ＨＤＤ２３と、通信Ｉ／Ｆ２４と、を有する。これらはバス２５を介して相互に電気的に接続している。 FIG. 6 is a block diagram showing an example of the hardware configuration of the IPU 500. As shown in FIG. 6, the IPU 500 has a CPU 20, a RAM 21, a ROM 22, an HDD 23, and a communication I/F 24. These are electrically connected to each other via a bus 25.

ＣＰＵ２０は演算手段であり、ＩＰＵ５００全体の動作を制御する。ＲＡＭ２１は、情報の高速な読み書きが可能な揮発性の記憶媒体である。ＣＰＵ２０は、情報を処理する際の作業領域としてＲＡＭ２１を用いる。ＲＯＭ２２は、読み出し専用の不揮発性記憶媒体であり、ファームウェア等のプログラムを格納している。 The CPU 20 is a computing unit that controls the overall operation of the IPU 500. The RAM 21 is a volatile storage medium that allows high-speed reading and writing of information. The CPU 20 uses the RAM 21 as a working area when processing information. The ROM 22 is a read-only non-volatile storage medium that stores programs such as firmware.

ＨＤＤ２３は、情報の読み書きが可能な不揮発性の記憶媒体であり、ＯＳや各種の制御プログラム、アプリケーションプログラム等を格納している。通信Ｉ／Ｆ２４は、バス２５と、各種のハードウェアやネットワーク等と、を接続して制御するインターフェースである。 The HDD 23 is a non-volatile storage medium that can read and write information, and stores the OS, various control programs, application programs, etc. The communication I/F 24 is an interface that connects and controls the bus 25 with various hardware and networks, etc.

ＩＰＵ５００は、上記のハードウェア構成において、ＲＯＭ２２やＨＤＤ２３又は光学ディスク等の記録媒体に格納されたプログラムをＲＡＭ２１に読み出し、ＣＰＵ２０の制御に従って動作させることにより、ソフトウェア制御部を構成する。ＩＰＵ５００は、このようにして構成されたソフトウェア制御部によって、ＩＰＵ５００の機能を実現する機能ブロックを構成する。 In the above hardware configuration, the IPU 500 configures a software control unit by loading a program stored on a recording medium such as the ROM 22, HDD 23, or optical disk into the RAM 21 and running it under the control of the CPU 20. The IPU 500 uses the software control unit configured in this way to form the functional blocks that realize the functions of the IPU 500.

＜ＩＰＵ５００の機能構成例＞ <Example of functional configuration of IPU 500>
図７は、ＩＰＵ５００の機能構成の一例を示すブロック図である。図７に示すように、ＩＰＵ５００は、入力部３１と、二値化処理部３２と、外形形状抽出部３３と、特徴点抽出部３４と、頂点候補出力部３５と、頂点候補評価部３６と、登録部３７と、認識結果出力部３８と、を有する。ＩＰＵ５００は、これらの機能を、ＣＰＵ２０がＲＯＭ２２に記憶されたプログラムを読み出し、ＲＡＭ２１を作業領域として実行することにより実現する。 FIG. 7 is a block diagram showing an example of the functional configuration of the IPU 500. As shown in FIG. 7, the IPU 500 has an input unit 31, a binarization processing unit 32, an outer shape extraction unit 33, a feature point extraction unit 34, a vertex candidate output unit 35, a vertex candidate evaluation unit 36, a registration unit 37, and a recognition result output unit 38. The IPU 500 realizes these functions by having the CPU 20 read out a program stored in the ROM 22 and execute the program using the RAM 21 as a work area.

入力部３１は、スキャナ１００による原稿Ｐｉの読取画像の少なくとも一部が入力された入力画像Ｉｍ０を、スキャナ１００から取得する。また入力部３１は、入力画像Ｉｍ０のサイズ、解像度、色または背景等の情報を取得する。入力部３１は、取得した入力画像Ｉｍ０と、入力画像Ｉｍ０のサイズ、解像度、色または背景等の情報と、を二値化処理部３２に出力する。 The input unit 31 acquires an input image Im0 from the scanner 100, which is at least a portion of the image of the original document Pi scanned by the scanner 100. The input unit 31 also acquires information about the size, resolution, color, background, etc. of the input image Im0. The input unit 31 outputs the acquired input image Im0 and information about the size, resolution, color, background, etc. of the input image Im0 to the binarization processing unit 32.

二値化処理部３２は、入力画像Ｉｍ０に含まれる複数の画素それぞれの画素値から生成されるヒストグラムに基づく二値化閾値を用いて、入力画像Ｉｍ０の二値化処理を実行し、処理結果である二値化画像Ｉｍ１を外形形状抽出部３３に出力する。 The binarization processing unit 32 performs binarization processing on the input image Im0 using a binarization threshold based on a histogram generated from the pixel values of each of the multiple pixels contained in the input image Im0, and outputs the resulting binarized image Im1 to the outer shape extraction unit 33.

なお、二値化閾値は、ヒストグラムに基づくものに限定されず、入力画像Ｉｍ０の色情報等に基づいて定めてもよい。但し、ヒストグラムに基づいて二値化閾値を定めると、入力画像Ｉｍ０内における図形以外の画像の影響を抑えられるため、より好適である。また、二値化処理を高速に行うために、二値化処理の前処理として、入力画像Ｉｍ０がカラー画像である場合にはモノクロ画像変換や、画像の解像度を低下させる変換等を行ってもよい。 The binarization threshold does not have to be based on a histogram, but may be determined based on color information of the input image Im0, etc. However, determining the binarization threshold based on a histogram is more preferable, as it reduces the influence of images other than figures in the input image Im0. Furthermore, in order to perform the binarization process quickly, preprocessing of the binarization process may involve monochrome image conversion if the input image Im0 is a color image, or conversion that reduces the image resolution.
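The histogram-based binarization described above can be sketched as follows. The description does not say how the threshold is derived from the histogram, so this sketch assumes Otsu's method as one common choice; the function names `histogram`, `otsu_threshold`, and `binarize` are hypothetical, and the image is assumed to be an 8-bit grayscale image given as a list of rows.

```python
def histogram(image):
    """Count how often each 8-bit gray level occurs in the image."""
    hist = [0] * 256
    for row in image:
        for v in row:
            hist[v] += 1
    return hist


def otsu_threshold(hist):
    """Assumed technique: pick the level that maximises between-class variance."""
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image, threshold):
    """Return 1 for dark (colored) pixels and 0 for light background pixels."""
    return [[1 if v <= threshold else 0 for v in row] for row in image]
```

With a mostly white page and a few dark strokes, the threshold falls between the two histogram peaks, so only the strokes survive as colored pixels in the binarized image.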

外形形状抽出部３３は、入力画像Ｉｍ０に基づく二値化画像Ｉｍ１に含まれる図形の外形形状を抽出し、抽出結果である外形形状画像Ｉｍ２を特徴点抽出部３４に出力する。例えば、外形形状抽出部３３は、入力画像Ｉｍ０に含まれる複数の画素のうち、黒色に着色した画素等の着色画素を検出し、該着色画素により構成される図形の外形形状を抽出する。 The outer shape extraction unit 33 extracts the outer shape of a figure contained in a binarized image Im1 based on the input image Im0, and outputs the resulting outer shape image Im2 to the feature point extraction unit 34. For example, the outer shape extraction unit 33 detects colored pixels, such as pixels colored black, from among the multiple pixels contained in the input image Im0, and extracts the outer shape of a figure composed of these colored pixels.
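As an illustration of the outer shape extraction, the sketch below keeps only those colored pixels that touch the background or the image border. This is one simple way to obtain an outline from a binarized image and is an assumption, not necessarily the method used by the outer shape extraction unit 33; `extract_outline` is a hypothetical name, and colored pixels are assumed to be encoded as 1.

```python
def extract_outline(binary):
    """Keep colored pixels that have a background (or out-of-image) 4-neighbour."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or not binary[ny][nx]:
                    out[y][x] = 1
                    break
    return out
```

For a filled 3x3 block, the eight border pixels are kept and the interior pixel is dropped.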

特徴点抽出部３４は、外形形状抽出部３３により抽出された図形の外形形状における特徴点を抽出し、抽出結果を頂点候補出力部３５に出力する。特徴点は、外形形状抽出部３３により抽出された図形における特徴的な部位を意味する。特徴的な部位は、認識したい図形に応じて予め定義しておくことができる。本実施形態では、例えば外形形状における角部を特徴点として抽出する。角部は、２つの直線が交差する部位を意味する。また本実施形態では、抽出結果は、入力画像Ｉｍ０内での特徴点の位置情報である。例えば外形形状を直線近似等により特徴点抽出処理を行うと、処理を簡略化して処理時間を短縮できるため好適である。 The feature point extraction unit 34 extracts feature points in the outer shape of the figure extracted by the outer shape extraction unit 33, and outputs the extraction results to the vertex candidate output unit 35. Feature points refer to characteristic parts of the figure extracted by the outer shape extraction unit 33. Characteristic parts can be defined in advance depending on the figure to be recognized. In this embodiment, for example, corners in the outer shape are extracted as feature points. A corner refers to a part where two straight lines intersect. In this embodiment, the extraction result is position information of the feature points within the input image Im0. For example, performing feature point extraction processing by linear approximation of the outer shape is preferable, as it simplifies the processing and shortens the processing time.
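The corner extraction can be sketched as follows, assuming the outer shape is already available as an ordered, closed path of pixel coordinates. A point is treated as a corner when the direction of travel changes there, which is the exact, noise-free case of a straight-line approximation; a real scan would need a tolerance. The name `corner_points` is hypothetical.

```python
def corner_points(path):
    """Return points of a closed path where the direction changes.

    path: ordered list of (x, y) points; the last point connects back to the first.
    """
    corners = []
    n = len(path)
    for i, (x, y) in enumerate(path):
        px, py = path[i - 1]          # previous point (wraps around)
        nx, ny = path[(i + 1) % n]    # next point (wraps around)
        ax, ay = x - px, y - py       # incoming direction
        bx, by = nx - x, ny - y       # outgoing direction
        if ax * by - ay * bx != 0:    # non-zero cross product: a turn
            corners.append((x, y))
    return corners
```

Applied to the outline of a square traced pixel by pixel, only the four corner positions are returned as feature points.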

頂点候補出力部３５は、特徴点抽出部３４により抽出された特徴点に基づき、チェックボックス５１１の頂点候補の検出結果を出力する。頂点候補は、チェックボックス５１１の頂点である可能性のある部位の位置を意味する。頂点候補の検出結果は、例えば入力画像Ｉｍ０内での頂点候補の位置である。本実施形態では、頂点候補出力部３５は、頂点候補情報を含む頂点候補画像Ｉｍ３を頂点候補の検出結果として頂点候補評価部３６に出力する。 The vertex candidate output unit 35 outputs the detection result of the vertex candidate of the check box 511 based on the feature points extracted by the feature point extraction unit 34. A vertex candidate means the position of a part that may be the vertex of the check box 511. The detection result of the vertex candidate is, for example, the position of the vertex candidate within the input image Im0. In this embodiment, the vertex candidate output unit 35 outputs a vertex candidate image Im3 containing vertex candidate information to the vertex candidate evaluation unit 36 as the detection result of the vertex candidate.

本実施形態では、頂点候補出力部３５は、入力画像Ｉｍ０に含まれる複数の画素それぞれの画素値から生成されるヒストグラムに基づく二値化閾値を用いて、入力画像Ｉｍ０を二値化した二値化画像Ｉｍ１から抽出される特徴点に基づき、チェックボックス５１１の頂点候補を出力する。 In this embodiment, the vertex candidate output unit 35 outputs vertex candidates for the check box 511 based on feature points extracted from the binarized image Im1 obtained by binarizing the input image Im0 using a binarization threshold based on a histogram generated from the pixel values of each of the multiple pixels included in the input image Im0.

また、本実施形態では、頂点候補出力部３５は、特徴点抽出部３４により抽出された複数の特徴点同士を結ぶ画像領域に含まれる複数の画素のうち、画素値が該特徴点に含まれる画素の画素値と一致する画素数に基づき、チェックボックス５１１の頂点候補の検出結果を出力する。 In addition, in this embodiment, the vertex candidate output unit 35 outputs the detection result of the vertex candidates of the check box 511 based on the number of pixels, among the pixels in the image area connecting the multiple feature points extracted by the feature point extraction unit 34, whose pixel values match the pixel value of the pixels included in the feature points.
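The pixel-count test between two feature points can be sketched as follows. The image area connecting the feature points is approximated here by a one-pixel-wide segment, and the count is turned into a ratio compared against a cut-off; both choices, the 0.9 default, and the names `line_pixels`, `matching_ratio`, and `is_edge` are assumptions for illustration.

```python
def line_pixels(p, q):
    """Sample integer pixel positions on the segment from p to q, inclusive."""
    (x0, y0), (x1, y1) = p, q
    steps = max(abs(x1 - x0), abs(y1 - y0))
    if steps == 0:
        return [p]
    return [
        (round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps))
        for i in range(steps + 1)
    ]


def matching_ratio(binary, p, q):
    """Fraction of pixels on segment p-q whose value equals the value at p."""
    pts = line_pixels(p, q)
    ref = binary[p[1]][p[0]]
    same = sum(1 for x, y in pts if binary[y][x] == ref)
    return same / len(pts)


def is_edge(binary, p, q, min_ratio=0.9):
    """Treat p and q as joined by a drawn line if enough pixels match (assumed cut-off)."""
    return matching_ratio(binary, p, q) >= min_ratio
```

Two corners of the same box side give a ratio near 1, whereas a diagonal across the empty interior of the box gives a low ratio, so the pair is not treated as an edge.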

頂点候補出力部３５は、入力画像Ｉｍ０の色成分情報およびヒストグラム情報の少なくとも一方に基づき、入力画像Ｉｍ０における手書き領域と手書き領域以外の領域とに区別された入力画像Ｉｍ０の特徴点に基づき、チェックボックス５１１の頂点候補を出力するように構成されてもよい。ここで、手書き領域とは、画像形成装置１のユーザが手書きにより書き込んだ領域を含む原稿Ｐｉをスキャナ１００により読み取った場合に、入力画像Ｉｍ０に含まれる該手書きにより書き込んだ領域に対応する領域をいう。 The vertex candidate output unit 35 may be configured to output vertex candidates for the check box 511 based on feature points of the input image Im0, which are distinguished into handwritten and non-handwritten areas, based on at least one of the color component information and histogram information of the input image Im0. Here, a handwritten area refers to an area in the input image Im0 that corresponds to the handwritten area when a document Pi containing an area handwritten by a user of the image forming device 1 is read by the scanner 100.

手書き領域は、ユーザが手書きにより原稿Ｐｉに書き込んだパターンまたはマークに基づくため、原稿Ｐｉに画像形成装置等によって形成された画像と比較して、画像の濃度が一定ではない。従って、頂点候補出力部３５は、例えば、入力画像Ｉｍ０に含まれる画像の濃度の変動幅が所定の変動幅閾値よりも大きい領域を手書き領域とし、変動幅閾値以下の領域を手書き領域以外の領域として区別できる。 Since handwritten regions are based on patterns or marks handwritten by a user on the document Pi, the image density is not constant compared to images formed on the document Pi by an image forming device or the like. Therefore, the vertex candidate output unit 35 can distinguish, for example, regions in the input image Im0 where the density fluctuation range of the image is greater than a predetermined fluctuation range threshold as handwritten regions, and regions below the fluctuation range threshold as non-handwritten regions.
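The density-based distinction can be sketched as follows. The fluctuation range is taken here as the difference between the largest and smallest gray level among the non-background pixels of a region; that definition, the background cut-off of 200, the threshold of 40, and the function names are all assumptions for illustration.

```python
def density_range(values):
    """Fluctuation range: difference between the largest and smallest value."""
    return max(values) - min(values)


def is_handwritten(region_pixels, range_threshold=40, background_level=200):
    """Classify a region as handwritten when its ink density varies widely.

    region_pixels: flat list of 8-bit gray levels in the region.
    Pixels at or above background_level are treated as paper and ignored.
    """
    ink = [v for v in region_pixels if v < background_level]
    if not ink:
        return False
    return density_range(ink) > range_threshold
```

A printed stroke with nearly uniform density stays below the threshold, while a pen stroke whose density varies along its length exceeds it.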

また、頂点候補出力部３５は、入力画像Ｉｍ０に含まれる罫線領域から検出される入力画像Ｉｍ０の傾きが補正された入力画像Ｉｍ０の特徴点に基づき、チェックボックス５１１の頂点候補を出力するように構成されてもよい。ここで、罫線領域とは、罫線が含まれる原稿Ｐｉをスキャナ１００により読み取った場合に、入力画像Ｉｍ０に含まれる該罫線に対応する領域をいう。 The vertex candidate output unit 35 may also be configured to output vertex candidates for the check box 511 based on feature points of an input image Im0 whose tilt, detected from a ruled line area included in the input image Im0, has been corrected. Here, the ruled line area refers to the area in the input image Im0 that corresponds to a ruled line when an original Pi containing the ruled line is read by the scanner 100.
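A minimal sketch of tilt detection and correction, assuming the two end points of a nominally horizontal ruled line have already been located. The tilt is the angle of that line against the horizontal axis, and each coordinate is rotated back by that angle. The names `skew_angle` and `deskew_point` are hypothetical, and how the ruled line itself is found is outside this sketch.

```python
import math


def skew_angle(p, q):
    """Angle in radians of the ruled line from p to q relative to the x axis."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def deskew_point(pt, angle, origin=(0.0, 0.0)):
    """Rotate pt by -angle about origin so the ruled line becomes horizontal."""
    c, s = math.cos(-angle), math.sin(-angle)
    dx, dy = pt[0] - origin[0], pt[1] - origin[1]
    return (origin[0] + c * dx - s * dy, origin[1] + s * dx + c * dy)
```

After correction, both end points of the ruled line share the same y coordinate, so box edges that run parallel to it become axis-aligned.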

また、頂点候補出力部３５は、入力画像Ｉｍ０に含まれる複数の画素それぞれの画素値が色反転された入力画像Ｉｍ０の特徴点に基づき、チェックボックス５１１の頂点候補を出力するように構成されてもよい。色反転とは、所定の色を補色関係にある色に変換することをいう。例えば色反転は、白黒画像における白色の画像領域を黒色に変換し、また黒色の画像領域を白色に変換する白黒反転等である。 The vertex candidate output unit 35 may also be configured to output vertex candidates for the check box 511 based on feature points of an input image Im0 in which the pixel values of each of the multiple pixels contained in the input image Im0 have been color-inverted. Color inversion refers to converting a given color into its complementary color. For example, color inversion is black-and-white inversion, in which white image areas in a black-and-white image are converted into black, and black image areas are converted into white.
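Color inversion can be sketched as follows for 8-bit images, assuming that the complementary color in RGB is obtained by subtracting each channel from the maximum value. The names `invert_gray` and `invert_rgb` are hypothetical.

```python
def invert_gray(image, max_value=255):
    """Black-and-white inversion: dark pixels become light and vice versa."""
    return [[max_value - v for v in row] for row in image]


def invert_rgb(pixel, max_value=255):
    """Map an (r, g, b) pixel to its complementary color."""
    r, g, b = pixel
    return (max_value - r, max_value - g, max_value - b)
```

Inverting a form printed as white boxes on a dark background turns the box outlines into dark lines on a light background.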

また、頂点候補出力部３５は、入力画像Ｉｍ０に含まれる複数の画素それぞれの画素値を色成分に分解して取得される色成分画像における特徴点に基づき、チェックボックス５１１の頂点候補を出力するように構成されてもよい。ここで、色成分画像とは、複数の原色を用いたカラーの入力画像Ｉｍ０を、原色ごとに色分解した画像をいう。例えば入力画像Ｉｍ０が赤（Ｒ）、緑（Ｇ）、青（Ｂ）を３原色としたカラー画像である場合には、色成分画像は、赤のみからなる画像、緑のみからなる画像および青のみからなる画像に対応する。 The vertex candidate output unit 35 may also be configured to output vertex candidates for the check box 511 based on feature points in a color component image obtained by decomposing the pixel values of each of the multiple pixels included in the input image Im0 into color components. Here, a color component image refers to an image obtained by color-separating the color input image Im0, which uses multiple primary colors, into each primary color. For example, if the input image Im0 is a color image with red (R), green (G), and blue (B) as the three primary colors, the color component images correspond to an image consisting of only red, an image consisting of only green, and an image consisting of only blue.
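Separation into color component images can be sketched as follows, assuming the input image is given as rows of (R, G, B) tuples. The name `split_channels` is hypothetical.

```python
def split_channels(image):
    """Split rows of (r, g, b) tuples into red, green, and blue component images."""
    return tuple([[px[c] for px in row] for row in image] for c in range(3))
```

Each returned component image is a single-channel image, so feature points can be extracted from it in the same way as from a grayscale image.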

また、頂点候補出力部３５は、入力画像Ｉｍ０の色成分情報およびヒストグラム情報の少なくとも一方に基づき、入力画像Ｉｍ０における帳票領域と帳票領域以外の領域とに区別された入力画像Ｉｍ０の特徴点に基づき、チェックボックス５１１の頂点候補を出力するように構成されてもよい。ここで、帳票領域とは、帳票が含まれる原稿Ｐｉをスキャナ１００により読み取った場合に、入力画像Ｉｍ０に含まれる該帳票に対応する領域をいう。帳票とは、記入するための空欄を設けた書類をいい、帳簿や伝票等の総称である。 The vertex candidate output unit 35 may also be configured to output vertex candidates for the check box 511 based on feature points of the input image Im0, which are separated into form areas and non-form areas in the input image Im0, based on at least one of the color component information and histogram information of the input image Im0. Here, the form area refers to the area corresponding to the form included in the input image Im0 when a document Pi including the form is read by the scanner 100. A form is a document with blank spaces for filling in information, and is a general term for ledgers, slips, etc.

また、頂点候補出力部３５は、入力画像Ｉｍ０に含まれる黒色領域が拡大された入力画像Ｉｍ０の特徴点に基づき、チェックボックス５１１の頂点候補を出力するように構成されてもよい。黒色領域は、画素値が黒色を示す画像領域を意味する。 The vertex candidate output unit 35 may also be configured to output vertex candidates for the check box 511 based on feature points of the input image Im0 in which the black areas contained in the input image Im0 have been enlarged. A black area refers to an image area whose pixel values indicate black.

頂点候補評価部３６は、頂点候補出力部３５から受け取った頂点候補画像Ｉｍ３に含まれる頂点候補情報を評価することにより、頂点候補がチェックボックス５１１の頂点であるか否かを判定する。 The vertex candidate evaluation unit 36 evaluates the vertex candidate information contained in the vertex candidate image Im3 received from the vertex candidate output unit 35 to determine whether the vertex candidate is the vertex of the check box 511.

チェックボックス５１１は、四角の図形であるため、頂点候補評価部３６は、頂点候補が四角の図形の四隅の頂点に該当するか否かを評価することにより、頂点候補がチェックボックス５１１の頂点であるか否かを判定できる。換言すると、頂点候補評価部３６は、チェックボックス５１１の大きさ情報および位置情報の少なくとも一方に基づき、頂点候補が四角の図形の四隅の頂点に該当するか否かを判定できる。 Because the check box 511 is a rectangular shape, the vertex candidate evaluation unit 36 can determine whether a vertex candidate is a vertex of the check box 511 by evaluating whether the vertex candidate corresponds to one of the four corner vertices of the rectangular shape. In other words, the vertex candidate evaluation unit 36 can determine whether a vertex candidate corresponds to one of the four corner vertices of the rectangular shape based on at least one of the size information and position information of the check box 511.

頂点候補評価部３６による評価基準は、例えば以下のものが挙げられる。
（１）複数の頂点候補同士を結ぶ直線が着色していること。
（２）頂点候補が矩形または正方形の一部を構成していること。
（３）頂点候補同士の間の距離が、入力画像Ｉｍ０に含まれる文字のサイズと比較して適当なサイズであること。
The evaluation criteria used by the vertex candidate evaluation unit 36 include, for example, the following:
(1) The lines connecting multiple vertex candidates are colored.
(2) The vertex candidates form part of a rectangle or square.
(3) The distance between the vertex candidates is an appropriate size compared to the size of the characters contained in the input image Im0.
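Criteria (2) and (3) above can be sketched as simple geometric tests. The acceptable size ratios, the perpendicularity tolerance, and the function names below are assumptions chosen for illustration; the specification only states that the distance should be "appropriate" relative to the character size.

```python
import math


def plausible_side_length(p, q, char_height, min_ratio=0.5, max_ratio=2.0):
    """Criterion (3): the distance between two candidates is comparable to
    the character size. The ratio bounds are hypothetical values."""
    dist = math.dist(p, q)
    return min_ratio * char_height <= dist <= max_ratio * char_height


def forms_right_angle(a, corner, b, tolerance=0.1):
    """Criterion (2): the edges corner->a and corner->b are roughly
    perpendicular, as expected at a corner of a rectangle or square.

    `tolerance` bounds the absolute cosine of the angle (assumed value).
    """
    ux, uy = a[0] - corner[0], a[1] - corner[1]
    vx, vy = b[0] - corner[0], b[1] - corner[1]
    norm = math.hypot(ux, uy) * math.hypot(vx, vy)
    if norm == 0:
        return False  # degenerate: two candidates coincide
    return abs(ux * vx + uy * vy) / norm <= tolerance


if __name__ == "__main__":
    print(plausible_side_length((0, 0), (20, 0), char_height=20))
    print(forms_right_angle((10, 0), (0, 0), (0, 10)))
```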

頂点候補評価部３６は、入力画像Ｉｍ０から検出された頂点候補が２つまたは３つである場合には、検出された頂点候補に基づきは、４つの頂点候補のうちの他の頂点候補を演算により推定し、この推定結果を頂点候補として用いて頂点候補を評価する。 When two or three vertex candidates are detected from the input image Im0, the vertex candidate evaluation unit 36 performs calculations to estimate the other vertex candidates out of the four vertex candidates based on the detected vertex candidates, and evaluates the vertex candidates using these estimation results as the vertex candidates.
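One way the remaining vertices might be computed is sketched below. The specification does not give the calculation, so the approach here is an assumption: with three detected corners, the fourth is found by the parallelogram rule; with two corners, the sketch additionally assumes that they are diagonal corners of an axis-aligned box. The function names are hypothetical.

```python
def fourth_vertex(p1, p2, p3):
    """Estimate the fourth corner of a rectangle from three detected corners.

    The corner whose two edges are closest to perpendicular is treated as the
    shared corner; the missing vertex is then a + b - corner.
    """
    pts = [p1, p2, p3]
    best = None
    for i in range(3):
        corner = pts[i]
        a, b = pts[(i + 1) % 3], pts[(i + 2) % 3]
        dot = abs((a[0] - corner[0]) * (b[0] - corner[0])
                  + (a[1] - corner[1]) * (b[1] - corner[1]))
        if best is None or dot < best[0]:
            best = (dot, corner, a, b)
    _, corner, a, b = best
    return (a[0] + b[0] - corner[0], a[1] + b[1] - corner[1])


def complete_from_diagonal(p, q):
    """Given two diagonal corners of an axis-aligned box (an assumption),
    return the other two corners."""
    return [(p[0], q[1]), (q[0], p[1])]


if __name__ == "__main__":
    print(fourth_vertex((0, 0), (10, 0), (0, 10)))
    print(complete_from_diagonal((0, 0), (10, 10)))
```

The estimated points can then be evaluated with the same criteria as detected candidates.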

登録部３７は、頂点候補評価部３６から評価結果を受け取り、頂点候補がチェックボックス５１１の頂点であった場合には、頂点があることに対応する情報と、この頂点の位置情報と、を登録する。チェックボックス５１１の頂点がなかった場合には、登録部３７は、頂点がないことに対応する情報を登録する。その後、登録部３７は、頂点の位置情報を認識結果出力部３８に出力する。 The registration unit 37 receives the evaluation result from the vertex candidate evaluation unit 36, and if the vertex candidate is the vertex of the check box 511, it registers information corresponding to the presence of a vertex and the position information of this vertex. If there is no vertex of the check box 511, the registration unit 37 registers information corresponding to the absence of a vertex. The registration unit 37 then outputs the position information of the vertex to the recognition result output unit 38.

認識結果出力部３８は、登録部３７から受け取ったチェックボックス５１１の認識結果を出力する。換言すると、認識結果出力部３８は、頂点候補出力部３５から出力される頂点候補の検出結果に基づき、チェックボックス５１１の認識結果を出力する。本実施形態では、認識結果出力部３８は、チェックボックス５１１の認識結果として、頂点の有無情報と、頂点がある場合にはチェックボックス５１１の位置情報と、を出力する。チェックボックス５１１は矩形の外形形状を有するため、チェックボックス５１１の認識結果は、チェックボックス５１１における矩形の少なくとも１つの頂点位置を含んでいる。 The recognition result output unit 38 outputs the recognition result of the check box 511 received from the registration unit 37. In other words, the recognition result output unit 38 outputs the recognition result of the check box 511 based on the vertex candidate detection result output from the vertex candidate output unit 35. In this embodiment, the recognition result output unit 38 outputs information on the presence or absence of a vertex as the recognition result of the check box 511, and, if a vertex is present, the position information of the check box 511. Because the check box 511 has a rectangular outer shape, the recognition result of the check box 511 includes the position of at least one vertex of the rectangle in the check box 511.

認識結果出力部３８は、例えばチェックボックス５１１の頂点位置情報をＩＰＵ５００の外部装置であるコントローラ２００に出力する。但し、外部装置はコントローラ２００に限定されず、プロッタ３００や表示装置１６等の他の外部装置であってもよい。 The recognition result output unit 38 outputs, for example, the vertex position information of the check box 511 to the controller 200, which is an external device of the IPU 500. However, the external device is not limited to the controller 200, and may be other external devices such as the plotter 300 or the display device 16.

＜画像形成装置１による原稿読取動作例＞
図８は、画像形成装置１による原稿読取動作の一例を示すフローチャートである。画像形成装置１は、入力操作部１７を用いたユーザによる原稿読取の開始指示を受け付けた際に図８の動作を開始する。
<Example of document reading operation by image forming apparatus 1>
FIG. 8 is a flowchart showing an example of an original reading operation by the image forming apparatus 1. The image forming apparatus 1 starts the operation of FIG. 8 when it receives an instruction to start original reading from a user using the input operation unit 17.

まず、ステップＳ８１において、画像形成装置１は、スキャナ１００により、ＡＤＦ４１またはスキャナユニット４２に載置された原稿Ｐｉを読み取る。 First, in step S81, the image forming device 1 uses the scanner 100 to read the original document Pi placed on the ADF 41 or scanner unit 42.

続いて、ステップＳ８２において、画像形成装置１は、コントローラ２００により、スキャナ１００によって読み取られた原稿Ｐｉの枚数をカウントする。 Next, in step S82, the image forming device 1, via the controller 200, counts the number of original documents Pi read by the scanner 100.

続いて、ステップＳ８３において、画像形成装置１は、コントローラ２００により、スキャナ１００が全ての原稿Ｐｉを読み取ったか否かを判定する。例えば、コントローラ２００は、ＡＤＦ４１またはスキャナユニット４２に原稿Ｐｉが載置されているか否か、あるいは読み取った原稿Ｐｉの枚数が、入力操作部１７を用いて入力された原稿Ｐｉの読取指示枚数に到達したか否か、を判定することにより、全ての原稿Ｐｉを読み取ったか否かを判定できる。 Next, in step S83, the image forming apparatus 1 determines, via the controller 200, whether the scanner 100 has read all of the original documents Pi. For example, the controller 200 can determine whether all of the original documents Pi have been read by determining whether original documents Pi are placed on the ADF 41 or the scanner unit 42, or whether the number of original documents Pi that have been read has reached the specified number of original documents Pi to be read that was input using the input operation unit 17.

ステップＳ８３において、全ての原稿Ｐｉを読み取っていないと判定された場合には（ステップＳ８３、Ｎｏ）、画像形成装置１は、ステップＳ８１以降の動作を再度行う。一方、全ての原稿Ｐｉを読み取ったと判定された場合には（ステップＳ８３、Ｙｅｓ）、ステップＳ８４において、画像形成装置１は、表示装置１６に原稿Ｐｉの情報を表示する。この原稿Ｐｉの情報は、例えば読み取った原稿Ｐｉの枚数または原稿Ｐｉのサイズ等の情報である。 If it is determined in step S83 that not all of the original documents Pi have been read (step S83, No), the image forming device 1 performs the operations from step S81 onwards again. On the other hand, if it is determined that all of the original documents Pi have been read (step S83, Yes), in step S84, the image forming device 1 displays information about the original documents Pi on the display device 16. This information about the original documents Pi is, for example, information such as the number of pages of the original documents Pi that have been read or the size of the original documents Pi.

続いて、ステップＳ８５において、画像形成装置１は、コントローラ２００により、読み取った原稿Ｐｉにおけるチェックボックス５１１を認識するか否かを判定する。この判定は、例えば入力操作部１７を用いたユーザ操作等に基づいて行うことができる。 Next, in step S85, the image forming device 1 determines, via the controller 200, whether or not to recognize the check box 511 on the scanned document Pi. This determination can be made based on, for example, a user operation using the input operation unit 17.

ステップＳ８５において、チェックボックス５１１を認識しないと判定された場合には（ステップＳ８５、Ｎｏ）、画像形成装置１は、動作を終了する。一方、チェックボックス５１１を認識すると判定された場合には（ステップＳ８５、Ｙｅｓ）、ステップＳ８６において、画像形成装置１は、チェックボックス５１１の認識処理を実行する。この認識処理の詳細は、図９を参照して次述する。画像形成装置１は、チェックボックス５１１の認識処理を終了後に動作を終了する。 If it is determined in step S85 that the check box 511 is not to be recognized (step S85, No), the image forming device 1 terminates operation. On the other hand, if it is determined that the check box 511 is to be recognized (step S85, Yes), in step S86, the image forming device 1 executes recognition processing for the check box 511. Details of this recognition processing are described below with reference to FIG. 9. The image forming device 1 terminates operation after completing recognition processing for the check box 511.

以上のようにして、画像形成装置１は、原稿Ｐｉを読み取ることができる。 In this way, the image forming device 1 can read the original document Pi.

＜ＩＰＵ５００による認識処理例＞
図９は、ＩＰＵ５００によるチェックボックス５１１の認識処理の一例を示すフローチャートである。ＩＰＵ５００は、図８の動作におけるステップ８６のタイミングにおいて、図９の処理を開始する。
<Example of recognition processing by IPU 500>
FIG. 9 is a flowchart showing an example of a process for recognizing the check box 511 by the IPU 500. The IPU 500 starts the process of FIG. 9 at the timing of step S86 in the operation of FIG. 8.

まず、ステップＳ９１において、ＩＰＵ５００は、入力部３１により、スキャナ１００から入力画像Ｉｍ０を入力することにより取得する。また入力部３１は、入力画像Ｉｍ０のサイズ、解像度、色または背景等の情報を取得する。入力部３１は、取得した入力画像Ｉｍ０、並びに入力画像Ｉｍ０のサイズ、解像度、色または背景等の情報を二値化処理部３２に出力する。 First, in step S91, the IPU 500 acquires the input image Im0 by inputting it from the scanner 100 via the input unit 31. The input unit 31 also acquires information about the size, resolution, color, or background of the input image Im0. The input unit 31 outputs the acquired input image Im0, as well as information about the size, resolution, color, or background of the input image Im0, to the binarization processing unit 32.

続いて、ステップＳ９２において、ＩＰＵ５００は、二値化処理部３２により、入力画像Ｉｍ０に含まれる複数の画素それぞれの画素値から生成されるヒストグラムに基づく二値化閾値を用いて、入力画像Ｉｍ０の二値化処理を実行し、処理結果である二値化画像Ｉｍ１を外形形状抽出部３３に出力する。 Next, in step S92, the IPU 500 causes the binarization processing unit 32 to perform binarization processing of the input image Im0 using a binarization threshold based on a histogram generated from the pixel values of each of the multiple pixels contained in the input image Im0, and outputs the processed result, the binarized image Im1, to the outer shape extraction unit 33.
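The specification states only that the binarization threshold is based on a histogram of pixel values; it does not name a specific rule. As one possible illustration, the sketch below uses Otsu's method, a widely used histogram-based technique that is substituted here as an assumption. The function names and the 0 = black, 1 = white encoding are also assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Pick a threshold that maximizes between-class variance (Otsu's method).

    `pixels` is a flat list of integer gray values in [0, levels).
    Values <= threshold are treated as the dark class.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break  # no bright pixels left
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image, threshold):
    """Map each pixel to 0 (black) if <= threshold, else 1 (white)."""
    return [[0 if px <= threshold else 1 for px in row] for row in image]


if __name__ == "__main__":
    gray = [[10, 200, 10], [200, 200, 10]]
    t = otsu_threshold([px for row in gray for px in row])
    print(t, binarize(gray, t))
```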

続いて、ステップＳ９３において、ＩＰＵ５００は、外形形状抽出部３３により、入力画像Ｉｍ０に基づく二値化画像Ｉｍ１に含まれる図形の外形形状を抽出し、抽出結果である外形形状画像Ｉｍ２を特徴点抽出部３４に出力する。複数の図形が二値化画像Ｉｍ１に含まれる場合には、外形形状抽出部３３は、全ての図形の外形形状の抽出結果を含む外形形状画像Ｉｍ２を特徴点抽出部３４に出力する。 Next, in step S93, the IPU 500 causes the outer shape extraction unit 33 to extract the outer shapes of the figures contained in the binarized image Im1 based on the input image Im0, and outputs the extracted outer shape image Im2 to the feature point extraction unit 34. If multiple figures are contained in the binarized image Im1, the outer shape extraction unit 33 outputs the outer shape image Im2 containing the extracted outer shapes of all of the figures to the feature point extraction unit 34.

続いて、ステップＳ９４において、ＩＰＵ５００は、特徴点抽出部３４により、外形形状抽出部３３により抽出された図形の外形形状における特徴点を抽出し、抽出結果を頂点候補出力部３５に出力する。図形の外形形状に複数の特徴点が含まれる場合には、特徴点抽出部３４は、全ての特徴点の抽出結果を頂点候補出力部３５に出力する。また複数の図形の外形形状が外形形状画像Ｉｍ２に含まれる場合には、特徴点抽出部３４は、全ての図形の外形形状における全ての特徴点の抽出結果を頂点候補出力部３５に出力する。 Next, in step S94, the IPU 500 causes the feature point extraction unit 34 to extract feature points in the outer shape of the figure extracted by the outer shape extraction unit 33, and outputs the extraction results to the vertex candidate output unit 35. If the outer shape of the figure includes multiple feature points, the feature point extraction unit 34 outputs the extraction results of all feature points to the vertex candidate output unit 35. Furthermore, if the outer shapes of multiple figures are included in the outer shape image Im2, the feature point extraction unit 34 outputs the extraction results of all feature points in the outer shapes of all figures to the vertex candidate output unit 35.

続いて、ステップＳ９５において、ＩＰＵ５００は、頂点候補出力部３５により、特徴点抽出部３４により抽出された特徴点に基づき、チェックボックス５１１の頂点候補の検出結果を出力する。例えば、頂点候補出力部３５は、頂点候補情報を含む頂点候補画像Ｉｍ３を頂点候補の検出結果として頂点候補評価部３６に出力する。頂点候補が複数含まれる場合には、頂点候補出力部３５は、全ての頂点候補の検出結果を含む頂点候補画像Ｉｍ３を頂点候補評価部３６に出力する。 Next, in step S95, the IPU 500 causes the vertex candidate output unit 35 to output the detection results of the vertex candidates for the check box 511 based on the feature points extracted by the feature point extraction unit 34. For example, the vertex candidate output unit 35 outputs a vertex candidate image Im3 containing vertex candidate information to the vertex candidate evaluation unit 36 as the vertex candidate detection result. If multiple vertex candidates are included, the vertex candidate output unit 35 outputs a vertex candidate image Im3 containing the detection results of all vertex candidates to the vertex candidate evaluation unit 36.

続いて、ステップＳ９６において、ＩＰＵ５００は、頂点候補評価部３６により、頂点候補出力部３５から受け取った頂点候補画像Ｉｍ３に含まれる頂点候補情報を評価することにより、頂点候補がチェックボックス５１１の頂点であるか否かを判定する。頂点候補が複数ある場合には、頂点候補評価部３６は、全ての頂点候補がチェックボックス５１１の頂点であるか否かを判定する。 Next, in step S96, the IPU 500 determines whether the vertex candidate is the vertex of the check box 511 by using the vertex candidate evaluation unit 36 to evaluate the vertex candidate information contained in the vertex candidate image Im3 received from the vertex candidate output unit 35. If there are multiple vertex candidates, the vertex candidate evaluation unit 36 determines whether all of the vertex candidates are the vertices of the check box 511.

続いて、ステップＳ９７において、ＩＰＵ５００は、登録部３７により、頂点候補評価部３６から評価結果を受け取り、頂点候補がチェックボックス５１１の頂点であった場合には、この頂点の位置情報を登録する。登録部３７は、全ての頂点の位置情報を認識結果出力部３８に出力する。 Next, in step S97, the IPU 500 receives the evaluation result from the vertex candidate evaluation unit 36 via the registration unit 37, and if the vertex candidate is the vertex of the check box 511, registers the position information of this vertex. The registration unit 37 outputs the position information of all vertices to the recognition result output unit 38.

続いて、ステップＳ９８において、ＩＰＵ５００は、認識結果出力部３８により、登録部３７から受け取ったチェックボックス５１１の認識結果を出力する。 Next, in step S98, the IPU 500 outputs the recognition result of the check box 511 received from the registration unit 37 via the recognition result output unit 38.

以上のようにして、ＩＰＵ５００は、入力画像Ｉｍ０に含まれるチェックボックス５１１の認識処理を実行することができる。 In this way, the IPU 500 can perform recognition processing of the check box 511 contained in the input image Im0.

＜ＩＰＵ５００による認識処理結果例＞
次に、ＩＰＵ５００による認識処理結果の一例を説明する。
<Example of recognition processing result by IPU 500>
Next, an example of the result of the recognition process by the IPU 500 will be described.

（入力画像Ｉｍ０の一例）
まず、図１０は、入力画像Ｉｍ０の一例を示す図である。図１０に示す入力画像Ｉｍ０は、図１に示した入力画像Ｉｍ０におけるさらに小さい領域であるチェックボックス５１１ａおよび５１１ｂの周辺領域を、取り出して表示している。
(Example of input image Im0)
First, FIG. 10 is a diagram showing an example of the input image Im0. The input image Im0 shown in FIG. 10 is an extracted view of the area surrounding the check boxes 511a and 511b, which is a smaller region of the input image Im0 shown in FIG. 1.

（二値化画像Ｉｍ１の一例）
図１１は、入力画像Ｉｍ０のヒストグラムを例示する図である。図１１において、横軸は、入力画像Ｉｍ０に含まれる画素の画素値を表し、縦軸は度数を表している。二値化処理部３２は、図１１のようなヒストグラムに基づき、入力画像Ｉｍ０に含まれる図形を抽出するために適切な二値化閾値を定めることができる。二値化処理部３２は、二値化処理により、図１２に示すような二値化画像Ｉｍ１を得ることができる。
(Example of binarized image Im1)
FIG. 11 is a diagram illustrating a histogram of input image Im0. In FIG. 11, the horizontal axis represents the pixel values of pixels included in input image Im0, and the vertical axis represents frequency. Based on the histogram shown in FIG. 11, the binarization processing unit 32 can determine an appropriate binarization threshold value for extracting shapes included in input image Im0. Through the binarization processing, the binarization processing unit 32 can obtain a binarized image Im1 as shown in FIG. 12.

（外形形状画像Ｉｍ２の一例）
図１３は、外形形状画像Ｉｍ２の一例を示す図である。図１３に示すように、外形形状抽出部３３は、入力画像Ｉｍ０に含まれる図形の外形形状５１２を抽出する。
(Example of outer shape image Im2)
FIG. 13 is a diagram showing an example of the outer shape image Im2. As shown in FIG. 13, the outer shape extraction unit 33 extracts the outer shape 512 of the figure included in the input image Im0.

（頂点候補５１４の一例）
図１４から図１６は、頂点候補５１４の検出処理の一例を説明する図であり、図１４は第１図、図１５は第２図、図１６は第３図である。
(Example of vertex candidates 514)
FIGS. 14 to 16 are diagrams explaining an example of the process of detecting the vertex candidates 514, with FIG. 14 being the first diagram, FIG. 15 the second diagram, and FIG. 16 the third diagram.

図１４に示すように、特徴点抽出部３４は、外形形状画像Ｉｍ２に含まれる外形形状５１２に基づき、特徴点５１３を抽出する。 As shown in Figure 14, the feature point extraction unit 34 extracts feature points 513 based on the outer shape 512 contained in the outer shape image Im2.

図１５は、外形形状画像Ｉｍ２におけるチェックボックス５１１ａ周辺の領域を取り出して表示している。頂点候補出力部３５は、チェックボックス５１１ａの四隅のうちの２つの角部である頂点候補５１４ａおよび５１４ｂを検出する。 Figure 15 shows the area around the check box 511a in the outer shape image Im2. The vertex candidate output unit 35 detects vertex candidates 514a and 514b, which are two of the four corners of the check box 511a.

図１６は、図１５のチェックボックス５１１ａを図１５よりもさらに拡大して表示した図である。図１６に示すように、頂点候補５１４ａおよび５１４ｂの位置は、チェックボックス５１１ａにおける黒色の画像領域の外枠部分に対応する。このため、頂点候補出力部３５は、頂点候補５１４ａ近傍の黒色の画像領域の外側位置５１４ａ'と内側位置５１４ａ''を参照して補正した頂点候補５１４ａの位置情報を出力する。また、頂点候補出力部３５は、頂点候補５１４ｂ近傍の黒色の画像領域の外側位置５１４ｂ'と内側位置５１４ｂ''を参照して補正した頂点候補５１４ｂの位置情報を出力する。 Figure 16 is a diagram showing check box 511a in Figure 15 at an even larger scale than Figure 15. As shown in Figure 16, the positions of vertex candidates 514a and 514b correspond to the outer frame portion of the black image area in check box 511a. Therefore, the vertex candidate output unit 35 outputs position information of vertex candidate 514a corrected by referring to outer position 514a' and inner position 514a'' of the black image area near vertex candidate 514a. In addition, the vertex candidate output unit 35 outputs position information of vertex candidate 514b corrected by referring to outer position 514b' and inner position 514b'' of the black image area near vertex candidate 514b.
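The specification says that the candidate position is corrected by referring to the outer and inner positions of the black frame, without giving the formula. A minimal sketch, assuming the corrected position is simply the midpoint of the two (that is, the center of the frame line thickness), could look as follows. The function name and the midpoint rule are both assumptions.

```python
def corrected_corner(outer, inner):
    """Return the midpoint between the outer and inner corner positions.

    `outer` and `inner` are (x, y) tuples such as 514a' and 514a''.
    Using the midpoint is an assumption; other weightings are possible.
    """
    return ((outer[0] + inner[0]) / 2, (outer[1] + inner[1]) / 2)


if __name__ == "__main__":
    print(corrected_corner((10, 10), (14, 14)))
```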

図１７は、頂点候補５１４の評価処理の一例を説明する図である。 Figure 17 is a diagram illustrating an example of the evaluation process for vertex candidates 514.

頂点候補評価部３６は、検出した頂点候補５１４ａと頂点候補５１４ｂとを結ぶ画像領域に含まれる複数の画素のうち、画素値が特徴点５１３に含まれる黒色に対応する画素値と一致する画素数に基づき、頂点候補５１４ａおよび５１４ｂを評価する。頂点候補５１４ａと頂点候補５１４ｂとを結ぶ画像領域に含まれる画素の画素値が黒色に対応する画素の画素数が所定の画素数閾値よりも多い場合に、頂点候補評価部３６は、頂点候補５１４ａおよび５１４ｂをそれぞれ頂点候補とする。 The vertex candidate evaluation unit 36 evaluates the vertex candidates 514a and 514b based on the number of pixels included in the image region connecting the detected vertex candidates 514a and 514b whose pixel values match the pixel values corresponding to the color black included in the feature point 513. If the number of pixels included in the image region connecting the vertex candidates 514a and 514b whose pixel values correspond to black is greater than a predetermined pixel number threshold, the vertex candidate evaluation unit 36 designates the vertex candidates 514a and 514b as vertex candidates, respectively.

図１７において、例えば誤検出頂点候補５１５が検出された場合には、誤検出頂点候補５１５と頂点候補５１４ａとを結ぶ画像領域に含まれる複数の画素の多くは、白色に対応する画素値となるため、黒色に対応する画素の画素数が所定の画素数閾値以下になる。このため、頂点候補評価部３６は、誤検出頂点候補５１５は頂点候補として適切ではないと評価し、誤検出頂点候補５１５を頂点候補から除去できる。 In Figure 17, for example, if a falsely detected vertex candidate 515 is detected, many of the pixels included in the image area connecting the falsely detected vertex candidate 515 and vertex candidate 514a will have pixel values corresponding to white, and the number of pixels corresponding to black will be below a predetermined pixel number threshold. Therefore, the vertex candidate evaluation unit 36 evaluates the falsely detected vertex candidate 515 as being inappropriate as a vertex candidate, and can remove the falsely detected vertex candidate 515 from the vertex candidates.
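The evaluation described above, in which black pixels between two candidates are counted and compared with a threshold, can be sketched as follows. The sampling scheme (one sample per pixel step along the segment), the 0 = black encoding, and the function names are assumptions for illustration.

```python
def black_pixels_between(image, p, q, black=0):
    """Count black pixels sampled along the straight segment from p to q.

    `image` is a list of rows; p and q are (x, y) pixel coordinates.
    """
    (x0, y0), (x1, y1) = p, q
    steps = max(abs(x1 - x0), abs(y1 - y0))
    count = 0
    for i in range(steps + 1):
        t = i / steps if steps else 0.0
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        if image[y][x] == black:
            count += 1
    return count


def connected_by_line(image, p, q, pixel_threshold):
    """Keep the pair only if the black pixel count exceeds the threshold."""
    return black_pixels_between(image, p, q) > pixel_threshold


if __name__ == "__main__":
    # Top row is a black edge; everything else is white.
    img = [[0, 0, 0, 0, 0]] + [[1] * 5 for _ in range(4)]
    print(connected_by_line(img, (0, 0), (4, 0), 3))  # along the edge
    print(connected_by_line(img, (0, 0), (4, 4), 3))  # across white space
```

A pair that lies on a frame edge passes the test, while a falsely detected candidate joined to a true corner mostly crosses white pixels and is rejected.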

（出力画像の一例）
図１８および図１９は、出力画像を示す図である。図１８は第１例に係る出力画像Ｑを示し、図１９は第２例に係る出力画像Ｑ'を示している。
(Example of output image)
FIGS. 18 and 19 are diagrams showing output images: FIG. 18 shows an output image Q according to the first example, and FIG. 19 shows an output image Q' according to the second example.

図１８では、出力画像Ｑは、チェックボックス５１１を指示する認識マーク５２１を含んでいる。一方、図１９では、出力画像Ｑ'は、認識マーク５２１に加え、対象項目５１０を指示する項目マーク５２３をさらに含んでいる。 In Figure 18, output image Q includes a recognition mark 521 indicating the check box 511. On the other hand, in Figure 19, output image Q' includes, in addition to the recognition mark 521, an item mark 523 indicating the target item 510.

（傾き検出処理例）
図２０は、頂点候補出力部３５による傾き検出処理の一例を説明する図である。図２０は、入力画像Ｉｍ０が二値化処理された二値化画像Ｉｍ１を示している。
(Tilt detection processing example)
Fig. 20 is a diagram illustrating an example of the inclination detection process performed by the vertex candidate output unit 35. Fig. 20 shows a binarized image Im1 obtained by binarizing the input image Im0.

ここで、原稿Ｐｉが傾いた状態においてスキャナ１００により原稿Ｐｉが読み取られると、入力画像Ｉｍ０および二値化画像Ｉｍ１が傾き、ＩＰＵ５００は、チェックボックス５１１の頂点候補５１４を検出できない場合がある。 Here, if the scanner 100 reads the original document Pi while it is tilted, the input image Im0 and the binarized image Im1 will be tilted, and the IPU 500 may not be able to detect the vertex candidate 514 of the check box 511.

図２０に示すように、原稿Ｐｉに含まれる罫線に対応する罫線領域５１６は、原稿Ｐｉの外形形状に沿って設けられている。このため、頂点候補出力部３５は、罫線領域５１６の傾きに基づいて入力画像Ｉｍ０および二値化画像Ｉｍ１の傾きを補正し、補正後の二値化画像Ｉｍ１から頂点候補５１４を検出する。これにより、ＩＰＵ５００は、チェックボックス５１１の頂点候補５１４の検出精度を高く確保可能になる。 As shown in Figure 20, the ruled line area 516 corresponding to the ruled lines included in the document Pi is arranged along the outline of the document Pi. Therefore, the vertex candidate output unit 35 corrects the inclination of the input image Im0 and the binary image Im1 based on the inclination of the ruled line area 516, and detects the vertex candidates 514 from the corrected binary image Im1. This allows the IPU 500 to ensure high accuracy in detecting the vertex candidates 514 of the check boxes 511.
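The tilt correction can be illustrated with a small geometric sketch. It assumes that a ruled line is nominally horizontal on the original, that two points on the detected ruled line are available, and that coordinates are rotated about a chosen center; none of these details are given in the specification, and the function names are hypothetical.

```python
import math


def skew_angle(line_start, line_end):
    """Estimate the tilt (radians) from two points on a ruled line that is
    assumed to be horizontal on the untilted original."""
    dx = line_end[0] - line_start[0]
    dy = line_end[1] - line_start[1]
    return math.atan2(dy, dx)


def deskew_point(point, angle, center=(0.0, 0.0)):
    """Rotate a point by -angle about `center` to undo the estimated tilt."""
    c, s = math.cos(-angle), math.sin(-angle)
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + c * x - s * y, center[1] + s * x + c * y)


if __name__ == "__main__":
    angle = skew_angle((0, 0), (100, 5))
    print(math.degrees(angle))
    print(deskew_point((100, 5), angle))
```

After correction, the far end of the ruled line lies on the same horizontal as its start, and the same rotation can be applied to every pixel or feature point before vertex candidates are detected.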

（色反転処理例）
図２１は、色反転処理の一例を説明する図である。図２１は、入力画像Ｉｍ０が二値化処理された二値化画像Ｉｍ１に対して白黒反転した色反転画像Ｉｍ１'を示している。
(Example of color inversion processing)
Fig. 21 is a diagram illustrating an example of color inversion processing, showing a color-inverted image Im1' obtained by inverting black and white from a binarized image Im1 obtained by binarizing an input image Im0.

頂点候補出力部３５は、原稿Ｐｉが色反転した画像が形成されたものや、画像の下地が着色されたものである場合においても、色反転画像Ｉｍ１'を用いて頂点候補５１４を検出できる。これにより、ＩＰＵ５００は、チェックボックス５１１の頂点候補５１４の検出精度を高く確保できる。 The vertex candidate output unit 35 can detect the vertex candidates 514 using the color-inverted image Im1' even if the document Pi contains a color-inverted image or the background of the image is colored. This allows the IPU 500 to ensure high accuracy in detecting the vertex candidates 514 of the check box 511.

（黒色領域の拡大処理例）
図２２は、黒色領域の拡大処理の一例を説明する図である。二値化画像Ｉｍ１は、入力画像Ｉｍ０が二値化された画像であり、黒色領域の拡大処理が施される前の状態を示している。黒色領域拡大画像Ｉｍ１''は、二値化画像Ｉｍ１におけるチェックボックス５１１の枠図形の枠の太さを太くすることにより、黒色領域の面積を増加させた画像である。
(Example of black area enlargement processing)
FIG. 22 is a diagram illustrating an example of the black area enlargement process. The binarized image Im1 is an image obtained by binarizing the input image Im0, and shows the state before the black area enlargement process is performed. The black-area-enlarged image Im1'' is an image in which the area of the black region is increased by thickening the frame of the check box 511 in the binarized image Im1.

例えば、原稿Ｐｉに形成された画像の劣化等により、チェックボックス５１１の一部が欠けている原稿Ｐｉである場合には、チェックボックス５１１の認識精度が低下する場合がある。黒色領域の拡大処理を施すことにより、チェックボックス５１１の欠けた部分を埋めることができるため、ＩＰＵ５００は、チェックボックス５１１の頂点候補５１４の検出精度を高く確保できる。 For example, if the document Pi has a check box 511 that is missing a portion due to degradation of the image formed on the document Pi, the accuracy of recognizing the check box 511 may decrease. By performing a black area enlargement process, the missing portion of the check box 511 can be filled in, allowing the IPU 500 to ensure high accuracy in detecting the vertex candidates 514 of the check box 511.
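The black area enlargement resembles morphological dilation of the black pixels. The sketch below uses a 3x3 neighborhood, which is an assumption; the specification does not state how the frame is thickened. The 0 = black, 1 = white encoding and the function name are also assumptions.

```python
def dilate_black(image, black=0):
    """Turn every pixel black if any pixel in its 3x3 neighborhood is black.

    `image` is a list of rows of 0/1 values; a new image is returned.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            if image[y][x] == black:
                continue  # already black
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and image[ny][nx] == black:
                        out[y][x] = black
    return out


if __name__ == "__main__":
    # A horizontal frame edge with a one-pixel gap at x = 2.
    edge = [[1, 1, 1, 1, 1],
            [0, 0, 1, 0, 0],
            [1, 1, 1, 1, 1]]
    for row in dilate_black(edge):
        print(row)
```

In this example the one-pixel gap in the edge is closed, at the cost of a thicker line, which is consistent with the trade-off described above.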

＜ＩＰＵ５００の作用効果＞
次に、ＩＰＵ５００の作用効果について説明する。
<Actions and Effects of IPU 500>
Next, the effects of the IPU 500 will be described.

ＯＣＲ技術を用いた従来のソフトウェアでは、チェックボックスを個別に定義しないと、ＯＣＲ技術によりチェックボックスを正しく認識できない場合があった。またＯＣＲ技術により帳票を認識する場合には、チェックボックスが含まれる枠を定義するだけでは、チェックボックスを正しくできなかったり、認識精度が低下したりする場合があった。従って、チェックボックスの認識精度を上げるためには、帳票に含まれる罫線内に形成された複数のチェックボックスを、ユーザが個別に手作業によって定義する必要があり、手間がかかった。 With conventional software using OCR technology, check boxes could not be correctly recognized unless they were individually defined. Furthermore, when recognizing forms using OCR technology, simply defining the frame that contains the check boxes could result in check boxes not being recognized correctly or reduced recognition accuracy. Therefore, in order to improve check box recognition accuracy, users had to manually define each of the multiple check boxes formed within the ruled lines included in the form, which was time-consuming.

実施形態に係るＩＰＵ５００は、原稿Ｐｉの少なくとも一部が入力された入力画像Ｉｍ０に含まれ、項目に対する選択を受け付けるチェックボックス５１１（選択受付図形）を認識可能な画像認識装置である。またＩＰＵ５００は、入力画像Ｉｍ０に含まれる図形の外形形状の特徴点５１３に基づき、チェックボックスの頂点候補５１４の検出結果を出力する頂点候補出力部３５と、頂点候補出力部３５から出力される頂点候補５１４の検出結果に基づき、チェックボックス５１１の認識結果を出力する認識結果出力部３８と、を有する。例えばチェックボックス５１１は矩形の外形形状を有し、チェックボックス５１１の認識結果は、チェックボックス５１１における矩形の少なくとも１つの頂点位置を含む。 The IPU 500 according to the embodiment is an image recognition device capable of recognizing a check box 511 (selection acceptance figure) that accepts selection for an item and is included in an input image Im0 into which at least a portion of a document Pi has been input. The IPU 500 also has a vertex candidate output unit 35 that outputs a detection result of vertex candidates 514 of the check box based on feature points 513 of the outer shape of the figure included in the input image Im0, and a recognition result output unit 38 that outputs a recognition result of the check box 511 based on the detection result of the vertex candidates 514 output from the vertex candidate output unit 35. For example, the check box 511 has a rectangular outer shape, and the recognition result of the check box 511 includes the position of at least one vertex of the rectangle in the check box 511.

実施形態では、外形形状の特徴点５１３に基づく頂点候補５１４に基づいてチェックボックス５１１を認識するため、チェックボックス５１１の認識精度を向上させることができる。これにより、実施形態では、入力画像Ｉｍ０に含まれるチェックボックス５１１の認識精度に優れたＩＰＵ５００を提供できる。 In this embodiment, the check box 511 is recognized based on vertex candidates 514 based on feature points 513 of the outer shape, thereby improving the recognition accuracy of the check box 511. As a result, in this embodiment, an IPU 500 can be provided that has excellent recognition accuracy for the check box 511 included in the input image Im0.

なお、本実施形態では、選択受付図形として四角形の枠状のチェックボックス５１１を例示したが、これに限定されない。頂点を有する図形であれば、三角形や六角形等の四角形以外の多角形であってもよいし、枠状の図形でなくてもよい。 In this embodiment, a rectangular frame-shaped check box 511 is used as an example of the selection acceptance figure, but the figure is not limited to this. As long as the figure has vertices, it may be a polygon other than a quadrilateral, such as a triangle or hexagon, and it does not have to be a frame-shaped figure.

また、本実施形態では、頂点候補出力部３５は、複数の特徴点５１３同士を結ぶ画像領域に含まれる複数の画素のうち、画素値が該特徴点５１３に含まれる画素の画素値と一致する画素数に基づき、チェックボックス５１１の頂点候補５１４の検出結果を出力する。これにより、実施形態では、誤検出された頂点候補５１４を除去してチェックボックス５１１の頂点候補５１４を検出できるため、チェックボックス５１１の認識精度に優れたＩＰＵ５００を提供できる。 In addition, in this embodiment, the vertex candidate output unit 35 outputs the detection result of the vertex candidates 514 of the check box 511 based on the number of pixels included in the image area connecting the multiple feature points 513, whose pixel values match the pixel values of the pixels included in the feature points 513. As a result, in this embodiment, it is possible to remove erroneously detected vertex candidates 514 and detect the vertex candidates 514 of the check box 511, thereby providing an IPU 500 with excellent recognition accuracy for the check box 511.

また、本実施形態では、頂点候補出力部３５は、入力画像Ｉｍ０に含まれる複数の画素それぞれの画素値から生成されるヒストグラムに基づく二値化閾値を用いて入力画像Ｉｍ０を二値化した二値化画像Ｉｍ１から抽出される特徴点５１３に基づき、チェックボックス５１１の頂点候補５１４を出力する。これにより、手書きのパターンが記入された原稿Ｐｉや、チェックボックス５１１上にノイズパターンが上書きされた原稿Ｐｉであっても、チェックボックス５１１の頂点候補５１４を検出できるため、チェックボックス５１１の認識精度に優れたＩＰＵ５００を提供できる。 In addition, in this embodiment, the vertex candidate output unit 35 outputs the vertex candidates 514 of the check box 511 based on feature points 513 extracted from the binarized image Im1 obtained by binarizing the input image Im0 using a binarization threshold based on a histogram generated from the pixel values of each of the multiple pixels included in the input image Im0. This makes it possible to detect the vertex candidates 514 of the check box 511 even in a document Pi on which a handwritten pattern has been written or a document Pi on which a noise pattern has been overwritten on the check box 511, thereby providing an IPU 500 with excellent recognition accuracy for the check box 511.

なお、頂点候補出力部３５は、入力画像Ｉｍ０の色成分情報およびヒストグラム情報の少なくとも一方に基づき、入力画像Ｉｍ０における手書き領域と手書き領域以外の領域とに区別された入力画像Ｉｍ０の特徴点５１３に基づき、チェックボックス５１１の頂点候補５１４を出力するように構成されてもよい。これにより、手書きによるパターンが記入された原稿Ｐｉであっても、チェックボックス５１１の頂点候補５１４を検出できるため、チェックボックス５１１の認識精度に優れたＩＰＵ５００を提供できる。 The vertex candidate output unit 35 may be configured to output vertex candidates 514 of the check box 511 based on feature points 513 of the input image Im0, which are distinguished into handwritten and non-handwritten areas in the input image Im0, based on at least one of the color component information and histogram information of the input image Im0. This makes it possible to detect vertex candidates 514 of the check box 511 even in a document Pi on which a handwritten pattern has been entered, thereby providing an IPU 500 with excellent recognition accuracy for the check box 511.

また、認識結果出力部３８は、チェックボックス５１１図形の大きさ情報および位置情報の少なくとも一方に基づき、チェックボックス５１１の認識結果を出力するように構成されてもよい。これにより、ＩＰＵ５００よりも後工程において処理を行う装置に対し、チェックボックス５１１を用いた選択を受け付ける対象項目５１０を含む出力画像Ｑを提供できる。 The recognition result output unit 38 may also be configured to output the recognition result of the check box 511 based on at least one of the size information and position information of the check box 511 figure. This makes it possible to provide an output image Q including target items 510 that accept selection using the check box 511 to a device that performs processing in a process downstream of the IPU 500.

また、頂点候補出力部３５は、入力画像Ｉｍ０に含まれる罫線領域５１６から検出される入力画像Ｉｍ０の傾きが補正された入力画像Ｉｍ０の特徴点５１３に基づき、チェックボックス５１１の頂点候補５１４を出力するように構成されてもよい。これにより、傾いた原稿Ｐｉを読み取った入力画像Ｉｍ０であっても、チェックボックス５１１の頂点候補５１４を検出できるため、チェックボックス５１１の認識精度に優れたＩＰＵ５００を提供できる。 The vertex candidate output unit 35 may also be configured to output vertex candidates 514 of the check box 511 based on feature points 513 of the input image Im0 after its tilt, which is detected from the ruled line area 516 included in the input image Im0, has been corrected. This makes it possible to detect vertex candidates 514 of the check box 511 even for an input image Im0 obtained by scanning a tilted document Pi, thereby providing an IPU 500 with excellent recognition accuracy for the check box 511.

また、頂点候補出力部３５は、入力画像Ｉｍ０に含まれる複数の画素それぞれの画素値が反転された入力画像Ｉｍ０の特徴点５１３に基づき、チェックボックス５１１の頂点候補５１４を出力するように構成されてもよい。これにより、色反転された画像が形成された原稿Ｐｉや、下地が着色された原稿Ｐｉであっても、チェックボックス５１１の頂点候補５１４を検出できるため、チェックボックス５１１の認識精度に優れたＩＰＵ５００を提供できる。 The vertex candidate output unit 35 may also be configured to output vertex candidates 514 of the check box 511 based on feature points 513 of the input image Im0, in which the pixel values of each of the multiple pixels contained in the input image Im0 are inverted. This makes it possible to detect the vertex candidates 514 of the check box 511 even in documents Pi on which a color-inverted image is formed or documents Pi with a colored background, thereby providing an IPU 500 with excellent recognition accuracy for the check box 511.

また、頂点候補出力部３５は、入力画像Ｉｍ０に含まれる複数の画素それぞれの画素値を色成分に分解して取得される色成分画像における特徴点５１３に基づき、チェックボックス５１１の頂点候補５１４を出力するように構成されてもよい。これにより、カラーの原稿Ｐｉや、手書きによるパターンが記入された原稿Ｐｉであっても、チェックボックス５１１の頂点候補５１４を検出できるため、チェックボックス５１１の認識精度に優れたＩＰＵ５００を提供できる。 The vertex candidate output unit 35 may also be configured to output vertex candidates 514 of the check box 511 based on feature points 513 in a color component image obtained by decomposing the pixel values of each of the multiple pixels included in the input image Im0 into color components. This makes it possible to detect vertex candidates 514 of the check box 511 even in a color document Pi or a document Pi with a handwritten pattern, thereby providing an IPU 500 with excellent recognition accuracy for the check box 511.

The vertex candidate output unit 35 may also be configured to output the vertex candidates 514 of the check box 511 based on feature points 513 of an input image Im0 that has been divided into form regions and regions other than form regions based on at least one of color component information and histogram information of the input image Im0. Because the vertex candidates 514 of the check box 511 can then be detected even in a document Pi on which a handwritten pattern has been written or a document Pi in which a noise pattern has been written over the check box 511, this provides an IPU 500 with excellent recognition accuracy for the check box 511.
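One possible way to separate form regions from other regions using color component information is sketched below. It assumes, purely for illustration, that the form is printed in near-gray tones and that strongly colored pixels (large spread between the color components) belong to handwriting or noise; the function name `mask_non_form`, the threshold value, and this separation criterion are hypothetical and are not taken from the description.

```python
def mask_non_form(image, spread_threshold=60, background=(255, 255, 255)):
    """Replace strongly colored pixels (assumed non-form) with the background color."""
    return [
        [background if max(px) - min(px) > spread_threshold else px for px in row]
        for row in image
    ]


# Black form line, blue handwriting, white paper.
masked = mask_non_form([[(0, 0, 0), (20, 30, 200), (255, 255, 255)]])
```

Feature points would then be extracted only from the pixels that remain, so colored overwriting does not contribute spurious corners.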

The vertex candidate output unit 35 may also be configured to output the vertex candidates 514 of the check box 511 based on feature points 513 of an input image Im0 in which the black areas included in the input image Im0 have been enlarged. Because the vertex candidates 514 of the check box 511 can then be detected even in a document Pi in which part of the check box 511 is missing due to deterioration of the image formed on the document Pi, this provides an IPU 500 with excellent recognition accuracy for the check box 511.
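Enlarging the black areas corresponds to what is commonly called morphological dilation. The sketch below is one simple way to do it on a binary image in which 1 denotes black; the function name `dilate`, the 8-neighborhood, and the one-pixel growth per iteration are assumptions for illustration rather than details given in the description.

```python
def dilate(binary, iterations=1):
    """Grow black (1) regions of a binary image by one pixel per iteration (8-neighborhood)."""
    h, w = len(binary), len(binary[0])
    for _ in range(iterations):
        out = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                # A pixel becomes black if it or any of its neighbors is black.
                if any(
                    binary[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))
                ):
                    out[y][x] = 1
        binary = out
    return binary


# A box edge with a one-pixel gap caused by faded print.
repaired = dilate([[1, 1, 0, 1, 1]])
```

The gap in the edge is closed, so the outer shape of the check box can again be traced as one connected contour.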

[Other Preferred Embodiments]
Other embodiments will be described below. Components that are the same as those in the embodiment described above are denoted by the same reference numerals, and redundant description is omitted as appropriate.

Figure 23 is a diagram showing an output image Qa according to another embodiment. The output image Qa includes a peripheral area 524 of a recognition mark 521 that indicates a check box 511, the peripheral area 524 being determined based on the detection result of the vertex candidates 514 output from the vertex candidate output unit 35. By outputting such an output image Qa, the recognition result output unit 38 can recognize the check box 511 even when a handwritten mark does not fit within the check box 511, which provides an IPU 500 with excellent recognition accuracy for the check box 511.
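As an illustration of how a peripheral area could be derived from detected vertex candidates, the sketch below takes the bounding box of the candidates and expands it by a fixed margin on every side. The function name `peripheral_area`, the rectangular shape of the area, and the `margin` parameter are assumptions; the description does not state how the peripheral area 524 is sized.

```python
def peripheral_area(vertices, margin):
    """Bounding box (left, top, right, bottom) of the vertex candidates, expanded by margin."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)


# Four vertex candidates of a 20 x 20 check box, with a 5-pixel margin.
area = peripheral_area([(10, 10), (30, 10), (30, 30), (10, 30)], 5)
```

A check mark that overshoots the box by a few pixels still falls inside the expanded area and can therefore be associated with the correct check box.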

Figure 24 is a block diagram showing an example of the functional configuration of an IPU 500b according to another embodiment. As shown in Figure 24, the IPU 500b has a misrecognition evaluation unit 39.

The misrecognition evaluation unit 39 outputs a detection result of at least one of characters and figures other than the check box 511, based on the recognition result of the check box 511 output from the recognition result output unit 38.

Figure 25 is a diagram showing an output image Qb according to another embodiment. The output image Qb' in Figure 25 includes a misrecognized check box 525, in which the character "団", which is not a check box 511, has been misrecognized as a check box.

The output image Qb in Figure 25, on the other hand, includes a misrecognition mark 526 indicating that the misrecognized check box 525 is a misrecognition. The misrecognition evaluation unit 39 detects the misrecognition based on the recognition result of the check box 511 output from the recognition result output unit 38 and adds the misrecognition mark 526. In this way, the IPU 500b can prevent misrecognition caused by characters and figures other than the check box 511 and improve the recognition accuracy for the check box 511.

Preferred embodiments have been described in detail above, but the present invention is not limited to those embodiments, and various modifications and substitutions can be made to them without departing from the scope of the claims.

In the embodiments described above, an electrophotographic image forming apparatus using toner was given as an example, but the embodiments are not limited to this and are also applicable to a liquid ejection image forming apparatus using a liquid such as ink. Because ink, like toner, is subject to uneven adhesion to the recording medium, scattering, bleeding, and the like, the same effects as with toner can be obtained.

The embodiments also include an image recognition method. For example, the image recognition method is performed by an image recognition device capable of recognizing a selection acceptance figure that is included in an input image into which at least a part of a document has been input and that accepts a selection for an item. In this method, the image recognition device outputs, by a vertex candidate output unit, a detection result of vertex candidates of the selection acceptance figure based on feature points of the outer shape of a figure included in the input image, and outputs, by a recognition result output unit, a recognition result of the selection acceptance figure based on the detection result of the vertex candidates output from the vertex candidate output unit. Such an image recognition method provides the same effects as the IPU 500 described above.

All ordinal numbers, quantities, and other numbers used in the description of the embodiments are examples given to explain the technology of the present invention concretely, and the present invention is not limited to those numbers. Likewise, the connection relationships between components are examples given to explain the technology of the present invention concretely, and the connection relationships that realize the functions of the present invention are not limited to them.

Each function of the embodiments can be implemented by one or more processing circuits. In this specification, "processing circuit" includes a processor programmed by software to execute each function, such as a processor implemented by an electronic circuit, as well as devices designed to execute each of the functions described above, such as an ASIC (Application Specific Integrated Circuit), a DSP (Digital Signal Processor), an FPGA (Field Programmable Gate Array), and conventional circuit modules.

1 Image forming apparatus
2 PC
3 Server
10, 20 CPU
11, 21 RAM
12, 22 ROM
13, 23 HDD
14, 24 Communication I/F
15, 25 Bus
16 Display device
17 Input operation unit
41 ADF
42 Scanner unit
43 Paper output tray
31 Input unit
32 Binarization processing unit
33 Outer shape extraction unit
34 Feature point extraction unit
35 Vertex candidate output unit
36 Vertex candidate evaluation unit
37 Registration unit
38 Recognition result output unit
39 Misrecognition evaluation unit
100 Scanner
200 Controller
204 First mirror unit
210 Second mirror unit
214 Photoelectric conversion element
215 First sensor board
216 Lens
300 Plotter
331 Exposure device
332 Photosensitive drum
333 Developing device
334 Transfer belt
335 Fixing device
401 Image memory
421, 422 Paper feed cassette
423 Paper feed means
500 IPU (an example of an image recognition device)
510 Target item
511, 511a, 511b Check box (an example of a selection acceptance figure)
512 Outer shape
513 Feature point
514, 514a, 514b Vertex candidate
515 Erroneously detected vertex candidate
516 Ruled line area
521 Recognition mark
522 Check mark
523 Item mark
524 Peripheral area
525 Misrecognized check box
526 Misrecognition mark
Im0 Input image
Im1 Binarized image
Im2 Outer shape image
NW Network
Pi Document
Po Printed matter
Q Output image

Japanese Patent Application Laid-Open No. 2021-039429

Claims

An image recognition device capable of recognizing a selection acceptance figure that accepts a selection for an item, the selection acceptance figure being included in an input image that includes at least a part of a document, the image recognition device comprising:
a vertex candidate output unit that outputs a detection result of vertex candidates of the selection acceptance figure based on feature points of the outer shape of a figure included in the input image;
a vertex candidate evaluation unit that determines whether or not each vertex candidate is a vertex candidate of the selection acceptance figure based on the detection result of the vertex candidates received from the vertex candidate output unit; and
a recognition result output unit that outputs a recognition result of the selection acceptance figure based on a determination result by the vertex candidate evaluation unit,
wherein the vertex candidate evaluation unit determines whether the vertex candidate is a vertex candidate of the selection acceptance figure by determining whether a straight line connecting a plurality of the vertex candidates is colored, or by comparing the distance between the vertex candidates with the size of characters included in the input image.

The image recognition device according to claim 1, wherein the selection acceptance figure has a rectangular outer shape, and
the recognition result of the selection acceptance figure includes the position of at least one vertex of the rectangle in the selection acceptance figure.

The image recognition device according to claim 1 or 2, wherein the vertex candidate output unit outputs the detection result of the vertex candidates based on the number of pixels that are included in an image region connecting the feature points and whose pixel values match the pixel values of pixels included in the feature points.

The image recognition device according to any one of claims 1 to 3, wherein the vertex candidate output unit outputs the vertex candidates based on the feature points extracted from a binarized image obtained by binarizing the input image using a binarization threshold based on a histogram generated from the pixel value of each of a plurality of pixels included in the input image.

The image recognition device according to any one of claims 1 to 4, wherein the vertex candidate output unit outputs the vertex candidates based on the feature points of the input image that has been divided into handwritten areas and non-handwritten areas based on at least one of color component information and histogram information of the input image.

The image recognition device according to any one of claims 1 to 5, wherein the recognition result output unit outputs the recognition result of the selection acceptance figure based on at least one of size information and position information of the selection acceptance figure.

The image recognition device according to any one of claims 1 to 6, wherein the vertex candidate output unit outputs the vertex candidates based on the feature points of the input image in which the tilt of the input image, detected from a ruled line region included in the input image, has been corrected.

The image recognition device according to any one of claims 1 to 7, wherein the vertex candidate output unit outputs the vertex candidates based on the feature points of the input image in which the pixel value of each of a plurality of pixels included in the input image has been inverted.

The image recognition device according to any one of claims 1 to 8, wherein the vertex candidate output unit outputs the vertex candidates based on the feature points in a color component image obtained by decomposing the pixel value of each of a plurality of pixels included in the input image into color components.

The image recognition device according to any one of claims 1 to 9, wherein the vertex candidate output unit outputs the vertex candidates based on the feature points of the input image that has been divided into form regions and regions other than the form regions based on at least one of color component information and histogram information of the input image.

The image recognition device according to any one of claims 1 to 10, wherein the vertex candidate output unit outputs the vertex candidates based on the feature points of the input image in which a black area included in the input image has been enlarged.

The image recognition device according to any one of claims 1 to 11, wherein the recognition result output unit outputs the recognition result of the selection acceptance figure including a peripheral area of a recognition mark indicating the selection acceptance figure, the peripheral area being determined based on the detection result of the vertex candidates output from the vertex candidate output unit.

The image recognition device according to any one of claims 1 to 12, further comprising a misrecognition evaluation unit that outputs a detection result of at least one of characters and figures other than the selection acceptance figure based on the recognition result of the selection acceptance figure output from the recognition result output unit.

An image recognition method performed by an image recognition device capable of recognizing a selection acceptance figure that accepts a selection for an item, the selection acceptance figure being included in an input image that includes at least a part of a document, wherein, in the image recognition device:
a vertex candidate output unit outputs a detection result of vertex candidates of the selection acceptance figure based on feature points of the outer shape of a figure included in the input image;
a vertex candidate evaluation unit determines whether or not each vertex candidate is a vertex candidate of the selection acceptance figure based on the detection result of the vertex candidates received from the vertex candidate output unit;
a recognition result output unit outputs a recognition result of the selection acceptance figure based on the detection result of the vertex candidates output from the vertex candidate output unit; and
the vertex candidate evaluation unit determines whether the vertex candidate is a vertex candidate of the selection acceptance figure by determining whether a straight line connecting a plurality of the vertex candidates is colored, or by comparing the distance between the vertex candidates with the size of characters included in the input image.