JP6989790B2

JP6989790B2 - Information processing system, its control method, and program

Info

Publication number: JP6989790B2
Application number: JP2019154188A
Authority: JP
Inventors: 浩輝大野; 裕真中村; 萌若林
Original assignee: Canon Marketing Japan Inc
Current assignee: Canon Marketing Japan Inc
Priority date: 2019-08-27
Filing date: 2019-08-27
Publication date: 2022-01-12
Anticipated expiration: 2039-08-27
Also published as: JP2021033721A

Description

The present invention relates to an information processing system, a control method for the system, and a program.

Character recognition techniques that analyze the pattern of characters in an image and output them as character data are conventionally known. Capturing the characters written on application forms, business forms, and similar documents as character data makes clerical and other office work more efficient.

In recent years, mechanisms that use machine learning for character recognition and character data extraction have been provided to further improve the accuracy of this technique.

Patent Document 1 describes a mechanism that uses a plurality of machine learners, each trained differently, to improve the accuracy of extracting character information from forms such as invoices.

Japanese Unexamined Patent Publication No. 2019-82814

When character recognition is performed by OCR (Optical Character Recognition) processing, a degraded input image cannot be processed well. For example, when the image to be recognized is a photograph of a form taken with a camera or a scanned image, shadows or blown-out highlights may appear in part of the image and prevent the characters from being recognized correctly.

The recognition accuracy of OCR processing, and the accuracy of its output, can therefore be improved by having a learning device learn such hard-to-recognize images as degraded images and then performing OCR processing.

Machine learning, however, requires a large amount of image data for training. Because images degrade in many different ways, collecting and preparing a large number of degraded images for each of a plurality of degradation patterns is very laborious for the user.

An object of the present invention is therefore to provide a mechanism that can easily generate the images needed to analyze degradation tendencies.

The information processing system of the present invention comprises: acquisition means for acquiring a first image that includes information on image degradation; learning means for causing a learning model to learn the first image acquired by the acquisition means as training data; and generation means for inputting a scanned second image, which is different from the first image, into the learning model and thereby generating a third image using the second image and the result learned by the learning model.

According to the present invention, a mechanism that can easily generate the images needed to analyze degradation tendencies can be provided.

FIG. 1 is a diagram showing an example of the configuration of an information processing system including the server device 102 according to an embodiment of the present invention.

FIG. 2 is a diagram showing an example of the hardware configuration of the server device 102 according to the embodiment of the present invention.

FIG. 3 is a diagram showing an example of the functional configuration of the server device 102 according to the embodiment of the present invention.

FIG. 4 is a flowchart showing an example of the degraded image learning model generation process and the degraded image generation process according to the embodiment of the present invention.

FIG. 5 is a flowchart showing an example of the OCR model learning process according to the embodiment of the present invention.

FIG. 6 is a flowchart showing an example of the overall processing, from preprocessing to output of the OCR result, in OCR processing of an image according to the embodiment of the present invention.

FIG. 7 is a flowchart showing an example of the detailed processing, from preprocessing to output of the OCR result, in OCR processing of an image according to the embodiment of the present invention.

FIG. 8 is a diagram showing an example of the group-specific preprocessing pattern 800 according to the embodiment of the present invention.

FIG. 9 is a diagram showing an example of the preprocessing setting screen 900 according to the embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings.

As shown in FIG. 1, the information processing system 100 of the present embodiment has a configuration in which an information processing device 101 and a server device 102 are communicably connected via a LAN 103. The information processing device 101 captures an image of a document, for example by photographing it with a camera or scanning it with an image forming device, and transmits the image to the server device 102. The server device 102 analyzes the image and transmits the analysis result to the information processing device 101. The server device 102 also learns information for generating degraded images from the images acquired from the information processing device 101, generates degraded images based on the learned information, and transmits them to the information processing device 101. Further, based on an image acquired from the information processing device 101, the server device 102 sorts the image into one of a set of groups for which preprocessing has been defined in advance, applies that group's preprocessing, performs OCR on the preprocessed image, and transmits the OCR result to the information processing device 101. The information processing device 101 may be a portable terminal such as a mobile phone, smartphone, tablet terminal, notebook PC, or PDA terminal. A browser or an image analysis application is installed on the information processing device 101, which can communicate with the server device 102 via the LAN 103.

The system configuration of the information processing system in the present embodiment is only an example, and configurations including components other than the information processing device 101 and the server device 102 are also possible.

Next, an example of a hardware configuration applicable to the server device 102 is described with reference to FIG. 2.

The CPU 201 centrally controls the devices and controllers connected to the system bus 204.

The ROM 202 or the external memory 211 stores the BIOS (Basic Input/Output System), which is the control program of the CPU 201, an operating system program (hereinafter, OS), and the like. The ROM 202 or the external memory 211 also stores the various programs needed to realize the functions executed by each server or PC. The RAM 203 functions as the main memory, work area, and the like of the CPU 201.

The CPU 201 realizes various operations by loading the programs and other data needed for processing into the RAM 203 and executing the programs.

The input controller (input C) 205 controls input from an input device 209 such as a keyboard and from a pointing device such as a mouse (not shown).

The video controller (VC) 206 controls output to a display device such as the display 210. The display device may be a CRT or a liquid crystal display.

The memory controller (MC) 207 controls access to the external memory 211, such as a hard disk (HD) that stores a boot program, browser software, various applications, font data, user files, edit files, and various other data. The memory controller (MC) 207 also controls access to external memory 211 such as a flexible disk (FD) or a card-type memory connected to a PCMCIA card slot via an adapter.

The communication I/F controller (communication I/FC) 208 connects to and communicates with external devices via a network and executes communication control processing on the network. For example, Internet communication using TCP/IP is possible.

The CPU 201 enables display on the display 210 by, for example, rasterizing outline fonts into a display information area in the RAM 203. The CPU 201 also allows the user to give instructions with a mouse cursor or the like (not shown) on the display 210.

Because the GPU 212 can compute efficiently by processing large amounts of data in parallel, it is effective to use the GPU 212 when training is repeated many times with a learning model, as in deep learning. In the embodiment of the present invention, therefore, the GPU 212 is used in addition to the CPU 201 for the processing of the learning model generation unit 303 and the machine learning unit 305. Specifically, when a learning program including a learning model is executed, training is carried out by the CPU 201 and the GPU 212 computing in cooperation. The processing of the learning model generation unit 303 and the machine learning unit 305 may instead be computed by the CPU 201 alone or by the GPU 212 alone. The GPU 212 may likewise be used for the degradation pattern analysis unit 302 and the image analysis unit 306.

The various programs used by the devices and servers of the present invention to execute the processes described later are recorded in the external memory 211 and are executed by the CPU 201 or the GPU 212 after being loaded into the RAM 203 as needed. The definition files and information tables used by the programs according to the present invention are also stored in the external memory 211.

FIG. 3 is a block diagram showing an example of the functional configuration of the server device 102.

The server device 102 includes an image acquisition unit 301, a degradation pattern analysis unit 302, a learning model generation unit 303, a degraded image generation unit 304, and a machine learning unit 305. The server device 102 also includes an image analysis unit 306, a preprocessing unit 307, a preprocessed image generation unit 308, and an OCR processing unit 309.

The image acquisition unit 301 is a functional unit that acquires, from the information processing device 101, image data of documents and forms captured with a camera, scanner, or the like.

The degradation pattern analysis unit 302 is a functional unit that analyzes the degradation pattern of an image acquired from the information processing device 101. Degradation patterns include degradation caused by shadows, degradation caused by light reflection or blown-out highlights, and degradation caused by character distortion due to folds or warping of the document. The degradation pattern analysis unit 302 analyzes which of these patterns an acquired image matches, using information such as the brightness, contrast, and tilt of the image.
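The kind of analysis attributed to the degradation pattern analysis unit 302 might be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: the function name `analyze_degradation`, the pattern labels, and every threshold are assumptions, the image is assumed to be a grayscale grid of 0-255 values, and tilt detection is omitted.

```python
from statistics import mean, pstdev


def analyze_degradation(pixels, dark_thresh=80, bright_thresh=200, low_contrast=30):
    """Classify a grayscale image (rows of 0-255 ints) into a coarse pattern.

    All thresholds are illustrative assumptions, not values from the source.
    """
    flat = [p for row in pixels for p in row]
    brightness = mean(flat)    # average luminance
    contrast = pstdev(flat)    # spread of luminance, used as a contrast proxy
    if brightness >= bright_thresh:
        pattern = "overexposure"
    elif brightness <= dark_thresh:
        pattern = "shadow"
    elif contrast <= low_contrast:
        pattern = "low_contrast"
    else:
        pattern = "none"
    return {"brightness": brightness, "contrast": contrast, "pattern": pattern}
```

A real analysis would likely work on image regions rather than on the whole image, since the description notes that shadows and blown-out highlights may affect only part of an image.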

The learning model generation unit 303 is a functional unit that uses the information analyzed by the degradation pattern analysis unit 302 to generate a learning model for generating degraded images. Using the learning model generated by the learning model generation unit 303, the server device 102 can generate a degraded image from a given image.

The degraded image generation unit 304 is a functional unit that generates a degraded image using a given image acquired by the image acquisition unit 301 and the learning model generated by the learning model generation unit 303. The degraded image generation unit 304 compares the generated degraded image with a comparison degraded image, either acquired by the image acquisition unit 301 or stored in advance, to confirm that the desired degraded image has been generated.

The machine learning unit 305 is a functional unit that performs machine learning by feeding degraded images into the learning model generated by the learning model generation unit 303 so that the model learns the characteristics of their degradation.

The image analysis unit 306 is a functional unit that analyzes an image acquired by the image acquisition unit 301 to determine its tendencies and routes it to the preprocessing for the corresponding group.

The preprocessing unit 307 is a functional unit that preprocesses an image acquired by the image acquisition unit 301. Preprocessing here means processing applied to an image before OCR in order to raise OCR recognition accuracy, such as contrast adjustment, noise removal, and binarization.
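The three preprocessing operations named above can be illustrated with a minimal pure-Python sketch. The function names, the 3x3 median window, and the fixed threshold of 128 are assumptions made for illustration; the source does not specify which algorithms the preprocessing unit 307 uses.

```python
from statistics import median


def stretch_contrast(img):
    """Linearly rescale pixel values so they span the full 0-255 range."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                       # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def median_filter(img):
    """Remove isolated noise with a 3x3 median (window clipped at the borders)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(int(median(window)))
        out.append(row)
    return out


def binarize(img, threshold=128):
    """Map each pixel to white (255) or black (0) around a fixed threshold."""
    return [[255 if p >= threshold else 0 for p in row] for row in img]
```

In this sketch the operations compose in the order the text lists them, for example `binarize(median_filter(stretch_contrast(img)))`.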

The preprocessed image generation unit 308 is a functional unit that generates a preprocessed image, that is, the result of the preprocessing performed by the preprocessing unit 307. When several kinds of preprocessing are applied to an image acquired by the image acquisition unit 301, the acquired image is duplicated and a different preprocessing is applied to each copy, so the server device 102 can generate a plurality of differently preprocessed images from a single image.
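The duplicate-then-preprocess behavior described here might look like the following sketch. The pipeline names and the two stand-in operations (`invert`, `binarize`) are hypothetical; only the idea of copying the acquired image once per preprocessing variant comes from the description.

```python
import copy


def invert(img):
    """Stand-in preprocessing step: invert a grayscale image."""
    return [[255 - p for p in row] for row in img]


def binarize(img, threshold=128):
    """Stand-in preprocessing step: threshold a grayscale image."""
    return [[255 if p >= threshold else 0 for p in row] for row in img]


def generate_preprocessed_images(image, pipelines):
    """Apply each named pipeline to its own copy of the acquired image."""
    results = {}
    for name, steps in pipelines.items():
        working = copy.deepcopy(image)   # duplicate so the original stays intact
        for step in steps:
            working = step(working)
        results[name] = working
    return results


# Hypothetical pipelines; real ones would be chosen per image group.
PIPELINES = {
    "binarized": [binarize],
    "inverted_binarized": [invert, binarize],
}
```

Calling `generate_preprocessed_images(image, PIPELINES)` returns one preprocessed image per pipeline name and leaves `image` itself unchanged.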

The OCR processing unit 309 is a functional unit that performs OCR processing on a preprocessed image produced by the preprocessing unit 307. The OCR processing unit 309 may include a plurality of OCR engines; in the present embodiment it includes three, OCR engine A, OCR engine B, and OCR engine C. The number of OCR engines may be one or more.

Next, an example of the degraded image learning model generation process and the degraded image generation process according to the present embodiment is described with reference to the flowchart of FIG. 4. The degraded image learning model generation process is an example of a process that generates the learning model used to generate degraded images. The degraded image generation process is an example of a process that automatically generates degraded images using the degraded image learning model produced by the degraded image learning model generation process.

In step S401, the CPU 201 of the server device 102 acquires a degraded image from the information processing device 101 through the function of the image acquisition unit 301. In the present embodiment the degraded image is acquired from the information processing device 101, but it may be acquired from any source capable of providing an image, such as a camera, a mobile terminal, a facsimile (hereinafter, FAX), or an image forming device (none shown). A degraded image in the present embodiment is an image whose characters are difficult to read because of shadows, light reflection, blown-out highlights, character distortion due to folds or warping of the document, or broken or blurred characters resulting from, for example, reduced image resolution. In step S401, an image containing such degradation information is acquired.

In step S402, the CPU 201 of the server device 102 accepts the user's selection of the degradation pattern of the image acquired in step S401. For example, the degradation pattern that the user has judged by eye may be accepted as user input on a degradation pattern selection screen (not shown).

In step S403, the CPU 201 of the server device 102 analyzes the degradation tendency of the image through the function of the degradation pattern analysis unit 302. The characteristics of the degradation are analyzed for each degradation pattern using information obtainable from the image, such as its brightness and contrast, together with the degradation pattern accepted in step S402. The degradation pattern need not be accepted from the user in step S402; the server device 102 may instead identify it through the analysis in step S403. In step S403, information such as the model of the device that captured the image, the resolution, and the image size may also be obtained from the image properties and used in the analysis. The analysis results and the acquired image property information are stored in the external memory of the server device 102.

In step S404, the CPU 201 of the server device 102 generates a degraded image learning model based on the degraded image information analyzed and acquired in step S403. That is, the degraded image learning model is generated by having a learning model learn the image acquired in step S401 as training data. By repeating steps S401 to S403, the degraded image learning model can machine-learn degradation pattern information, and degraded images can then be generated using this model.
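The accumulation of degradation information over repeated passes through steps S401 to S403 can be pictured with a deliberately simple stand-in. The class below is not the learning model of the embodiment, which the description associates with deep learning; it is a hypothetical statistical profile (the names `DegradationModel`, `learn`, and `profile` are assumptions) that only shows how per-pattern brightness and contrast statistics could be gathered from labeled degraded images.

```python
from statistics import mean, pstdev


class DegradationModel:
    """Toy stand-in for a degraded image learning model (illustrative only)."""

    def __init__(self):
        self._samples = {}   # pattern name -> list of (brightness, contrast)

    def learn(self, pattern, img):
        """Record the statistics of one degraded image labeled with `pattern`."""
        flat = [p for row in img for p in row]
        self._samples.setdefault(pattern, []).append((mean(flat), pstdev(flat)))

    def profile(self, pattern):
        """Return the average statistics learned so far for `pattern`."""
        stats = self._samples.get(pattern)
        if not stats:
            raise KeyError(f"no training data for pattern {pattern!r}")
        return {
            "brightness": mean(s[0] for s in stats),
            "contrast": mean(s[1] for s in stats),
            "count": len(stats),
        }
```

Each call to `learn` corresponds to one pass through S401 to S403 in this simplified picture, and `profile` exposes what a generator could later use to reproduce the pattern.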

In step S405, the CPU 201 of the server device 102 acquires, from the information processing device 101, an image scanned by a scanner (not shown) or an image forming device (not shown) and inputs it into the degraded image learning model. The image read here is different from the image acquired in step S401.

In step S406, the CPU 201 of the server device 102 generates a degraded image from the image input in step S405. In other words, a new degraded image is generated as a result of inputting the image into the degraded image learning model in step S405.

In step S407, the CPU 201 of the server device 102 compares the image acquired in step S405 with the degraded image generated in step S406 from that image. If the server device 102 already stores an identical image, the degraded image generated in step S406 may be compared with that stored image instead.

In step S408, the CPU 201 of the server device 102 determines, from the comparison of the two images in step S407, whether the image acquired in step S405 and the degraded image generated in step S406 are similar. Similarity may be determined with conventional image analysis, for example using image feature values, or with any other technique capable of determining image similarity. If the CPU 201 of the server device 102 determines that the image acquired in step S405 and the degraded image generated in step S406 are similar, it judges that an appropriate degraded image has been generated and ends the series of processes. If it determines that they are not similar, it returns the process to step S401. In this case, the CPU 201 of the server device 102 may return the process to step S405 instead of step S401.
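Since the description leaves the similarity technique open, one possible feature-based check is sketched below using histogram intersection. The choice of an 8-bin luminance histogram as the feature, the function names, and the 0.8 threshold are all assumptions made for illustration.

```python
def histogram(img, bins=8):
    """Normalized luminance histogram of a grayscale image (0-255 values)."""
    counts = [0] * bins
    total = 0
    for row in img:
        for p in row:
            counts[min(p * bins // 256, bins - 1)] += 1
            total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [c / total for c in counts]


def similarity(a, b, bins=8):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    ha, hb = histogram(a, bins), histogram(b, bins)
    return sum(min(x, y) for x, y in zip(ha, hb))


def is_similar(a, b, threshold=0.8):
    """Decision corresponding to step S408 (the threshold is an assumed value)."""
    return similarity(a, b) >= threshold
```

A histogram ignores spatial layout, so a production check would probably combine it with structural features; the sketch only shows the shape of the S407 comparison and the S408 decision.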

Next, the degraded image generation process is described using steps S409 to S412 of the flowchart of FIG. 4. Steps S409 to S412 generate degraded images using the degraded image learning model produced in steps S401 to S408. In the present embodiment, the degraded images generated with the degraded image learning model are used as training data for the degraded image learning model itself and also as training data for tuning the OCR model described later. The degraded image learning model may also be used to generate degraded images for purposes other than training data.

In step S409, the CPU 201 of the server device 102 acquires, from the information processing device 101 through the function of the image acquisition unit 301, an image that has been scanned or photographed with a camera (not shown).

In step S410, the CPU 201 of the server device 102 accepts the user's selection of a degradation pattern for the image acquired in step S409. For example, the degradation pattern that the user has judged by eye may be accepted as user input on a degradation pattern selection screen (not shown).

In step S411, the CPU 201 of the server device 102 inputs the image acquired in step S409 into the degraded image learning model created in step S404.

In step S412, the CPU 201 of the server device 102 generates a degraded image with the degraded image learning model through the function of the degraded image generation unit 304. More specifically, it generates a degraded version of the image acquired in step S409 according to the degradation pattern selected in step S410. For example, the CPU 201 of the server device 102 reproduces image degradation by changing and adjusting the brightness and contrast of the image acquired in step S409 based on the degradation pattern information and the image property information. The degraded image generated in step S412 is then fed to the degraded image learning model as training data. By repeating this process, the degraded image learning model keeps learning about degraded images.
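The brightness and contrast adjustment mentioned for step S412 can be sketched as a per-pattern gain and offset applied to each pixel. In the embodiment these adjustments would come from the degraded image learning model; here the table `DEGRADATION_PARAMS` holds fixed, made-up values, and the function names are likewise hypothetical.

```python
# Hypothetical per-pattern parameters standing in for learned values.
# gain scales contrast around mid-gray; offset shifts brightness.
DEGRADATION_PARAMS = {
    "shadow": {"gain": 0.6, "offset": -40},
    "overexposure": {"gain": 1.2, "offset": 60},
    "low_contrast": {"gain": 0.4, "offset": 75},
}


def clamp(value):
    """Round to an integer and keep it inside the valid 0-255 range."""
    return max(0, min(255, int(round(value))))


def degrade(img, pattern, params=DEGRADATION_PARAMS):
    """Return a degraded copy of a grayscale image for one pattern."""
    p = params[pattern]
    return [[clamp(p["gain"] * (px - 128) + 128 + p["offset"]) for px in row]
            for row in img]


def generate_variants(img, params=DEGRADATION_PARAMS):
    """Produce one degraded image per known pattern from a single input."""
    return {name: degrade(img, name, params) for name in params}
```

`generate_variants` mirrors the point made in the description that several types of degraded image can be produced from one scanned image.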

By repeating steps S409 to S412 in this way, the server device 102 can automatically generate multiple types of degraded images. The degraded image learning model can learn degradation tendencies suited to the acquired image, based on the selected degradation pattern and on the type and property information of the acquired scanned image. Using what it has learned, the degraded image learning model can also generate multiple types of degraded images from a single image. A large number of degraded images reproducing different types of degradation can thus be generated without effort on the user's part.

Next, the OCR model learning process is described with reference to the flowchart of FIG. 5. The OCR model of the present embodiment combines an OCR engine with machine learning; it learns OCR results for each degradation pattern and can thereby raise OCR recognition accuracy.

In step S501, the CPU 201 of the server device 102 inputs the degraded image generated in step S412 of FIG. 4 into the OCR model.

In step S502, the CPU 201 of the server device 102 analyzes and identifies, through the function of the degradation pattern analysis unit 302, the degradation pattern of the degraded image input into the OCR model.

In step S503, the CPU 201 of the server device 102, through the function of the machine learning unit 305, reads the degraded image, recognizes the character strings it contains and performs OCR, and has the OCR model learn the result. The learning in steps S502 and S503 is repeated to raise the recognition accuracy of the OCR model.

In step S504, the CPU 201 of the server device 102 inputs a test degraded image, stored in advance for testing, into the OCR model.

In step S505, the CPU 201 of the server device 102 performs OCR on the test degraded image input in step S504 through the function of the OCR processing unit 309 and determines whether the OCR result is equal to or greater than a predetermined value. If so, the process proceeds to step S506. If not, the OCR reading accuracy is still insufficient, so the process returns to step S501 and the training of the OCR model is repeated.
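The train-then-test loop of steps S501 to S505 might be organized as in the following sketch. It assumes that the "predetermined value" is a character-level accuracy threshold, which the source does not state, and the callables `train_step` and `recognize`, the 0.95 threshold, and the `max_rounds` cap are all hypothetical.

```python
def char_accuracy(predicted, expected):
    """Fraction of character positions where the OCR output matches the truth."""
    if not expected:
        return 1.0 if not predicted else 0.0
    matches = sum(1 for p, e in zip(predicted, expected) if p == e)
    return matches / max(len(predicted), len(expected))


def train_until_accurate(train_step, recognize, test_set,
                         threshold=0.95, max_rounds=10):
    """Repeat training (S501-S503) until the test accuracy (S504-S505) passes.

    train_step: callable that runs one round of OCR model training.
    recognize:  callable mapping a test image to recognized text.
    test_set:   list of (test_image, expected_text) pairs.
    Returns (rounds_used, last_accuracy).
    """
    if not test_set:
        raise ValueError("test_set must not be empty")
    accuracy = 0.0
    for round_no in range(1, max_rounds + 1):
        train_step()
        scores = [char_accuracy(recognize(img), text) for img, text in test_set]
        accuracy = sum(scores) / len(scores)
        if accuracy >= threshold:      # corresponds to proceeding to S506
            return round_no, accuracy
    return max_rounds, accuracy
```

The `max_rounds` cap is an addition of this sketch so that the loop terminates even when the threshold is never reached; the flowchart as described simply returns to step S501.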

In step S506, the CPU 201 of the server device 102 outputs the result of the OCR performed in step S504. If OCR was performed by a plurality of OCR engines, the OCR results are merged before being output. The OCR model may also be trained on the output result.

This concludes the description, with reference to FIGS. 4 and 5, of the deteriorated image learning model generation process, the deteriorated image generation process, and the OCR model learning process.

By performing the above processing, many images with various deterioration patterns can be created from a small number of deteriorated images, which reduces the effort required of a user who would otherwise have to prepare a large number of deteriorated images for machine learning. Furthermore, having the deteriorated image learning model learn the generated deteriorated images as learning data improves how faithfully deterioration is reproduced in the generated images. In addition, having the OCR model learn the generated deteriorated images as learning data also improves the recognition accuracy of OCR processing.

Next, with reference to FIGS. 6 and 7, a series of steps for easily performing appropriate preprocessing for each OCR engine when OCR is performed with a plurality of OCR engines will be described. To perform OCR on an image using a plurality of OCR engines, the image must be routed to an appropriate OCR engine according to the image, and preprocessing suited to that OCR engine must be applied to the image. In the present embodiment, therefore, the image is routed to an appropriate OCR engine while preprocessing suited to that engine is performed, in order to raise OCR recognition accuracy.

FIG. 6 is a flowchart showing the overall flow from preprocessing an image acquired by the server device 102 to performing OCR on it.

In step S601, the CPU 201 of the server device 102 analyzes the acquired image. The analysis of the acquired image is described in detail using the flowchart of input image analysis in FIG. 7.

In step S701 of FIG. 7, the CPU 201 of the server device 102 uses the image acquisition unit 301 to acquire an image from the information processing device 101.

In step S702, the CPU 201 of the server device 102 identifies the property information of the image acquired in step S701. For example, it identifies information such as the model of the device that photographed, scanned, or faxed the image, and the resolution of the image.

In step S703, the CPU 201 of the server device 102 uses the image analysis unit 306 to extract the paper portion from the image and analyze the resolution and inclination of the paper portion. Because the images in the present embodiment are images of documents and forms, the information on the paper portion is extracted and analyzed.

In step S704, the CPU 201 of the server device 102 analyzes the brightness of the paper portion extracted in step S703. Brightness is analyzed from information such as the luminance and contrast values of the image.
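As an illustration of the brightness analysis in step S704, the following sketch computes two common statistics from a grayscale paper region. The text does not say which measures are used, so the choice of mean luminance and RMS contrast (the standard deviation of pixel values), the function name `analyze_brightness`, and the list-of-rows image representation are all assumptions made for this example.

```python
from statistics import fmean, pstdev
from typing import Dict, List

GrayImage = List[List[int]]  # rows of pixel values in the range 0 to 255


def analyze_brightness(gray: GrayImage) -> Dict[str, float]:
    """Return the mean luminance and RMS contrast of a grayscale region."""
    pixels = [p for row in gray for p in row]
    return {
        "mean_luminance": fmean(pixels),  # overall brightness
        "rms_contrast": pstdev(pixels),   # spread of pixel values
    }


# A tiny 2x2 region: half black, half white.
stats = analyze_brightness([[0, 255], [0, 255]])
```

A low `mean_luminance` would suggest a dark image, and a low `rms_contrast` would suggest a washed-out or flat one. Either value could then be compared with a predetermined threshold.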

In step S705, the CPU 201 of the server device 102 determines, based on the information acquired, identified, and analyzed in steps S702 to S704, whether the image acquired in step S701 corresponds to a preset image group. If the CPU 201 of the server device 102 determines that the image acquired in step S701 corresponds to an image group, the process proceeds to step S706. If it determines that the image does not correspond to any image group, the process proceeds to step S707.

Here, image groups and the preprocessing pattern for each group will be described with reference to FIG. 8. The group-specific preprocessing pattern 800 in FIG. 8 shows an example of the correspondence between image groups and the preprocessing patterns for each group. It also shows an example of the correspondence between image characteristics and preprocessing patterns. Image groups differ according to the characteristics of the deterioration pattern. For example, an image with noise falls into group A, and an image that is dark overall (brightness at or below a predetermined value) falls into group B. An image whose overall values, such as brightness and inclination, are at or above predetermined values is treated as a good image and falls into group C, and an image that is both noisy and dark falls into group D.

In addition, the group-specific preprocessing pattern 800 defines, for each group, the preprocessing pattern to be applied to the image. Which preprocessing is performed for a group is set per group from the input accepted on the preprocessing setting screen 900 of FIG. 9. On the preprocessing setting screen 900, the required preprocessing can be set by selecting check boxes. For binarization, the binarization value can be set finely, so that the degree of binarization can be specified for each binarization pattern.

Returning to FIG. 8: for example, preprocessing pattern a is applied to group A in FIG. 8. Preprocessing pattern a specifies binarization pattern A, black-spot noise removal, and direction correction, together with binarization pattern D and vertical-line noise removal, so this combination of preprocessing is applied to the image.

Returning to FIG. 7: in step S706, when the CPU 201 of the server device 102 has determined in step S705 that the acquired image corresponds to an existing image group, it fixes the image group to which the image is assigned. For example, the acquired image is assigned to group A of FIG. 8.

In step S707, when the CPU 201 of the server device 102 has determined in step S705 that the image does not correspond to any existing image group, it assigns the image to the other group. The other group is a group for which no preprocessing pattern is set.
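The grouping decision of steps S705 to S707 can be sketched as a small rule table. The group letters follow the example given for FIG. 8, but everything else is an assumption for illustration: the feature names, the threshold constants, the rule order, and the `classify` function are hypothetical, and the actual criteria are not specified in the text.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the source only speaks of "predetermined values".
NOISE_LIMIT = 0.2      # fraction of pixels judged to be noise
DARK_LIMIT = 80.0      # mean luminance at or below this counts as dark
SKEW_LIMIT_DEG = 5.0   # tolerated inclination of the paper


@dataclass
class ImageFeatures:
    noise_ratio: float
    mean_luminance: float
    skew_deg: float


def classify(f: ImageFeatures) -> str:
    """Map analyzed features to an image group, or to the other group."""
    noisy = f.noise_ratio > NOISE_LIMIT
    dark = f.mean_luminance <= DARK_LIMIT
    if noisy and dark:
        return "D"      # noisy and dark
    if noisy:
        return "A"      # noisy only
    if dark:
        return "B"      # dark only
    if abs(f.skew_deg) <= SKEW_LIMIT_DEG:
        return "C"      # good image
    return "OTHER"      # no preset group matches (step S707)


group = classify(ImageFeatures(noise_ratio=0.5, mean_luminance=200.0, skew_deg=0.0))
```

Checking the combined condition (noisy and dark) before the single conditions keeps group D from being shadowed by groups A and B.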

This concludes the description of this part of FIG. 7. With this processing, acquired images can be grouped by tendency according to their deterioration and type, so that appropriate preprocessing can be applied to each image in the preprocessing steps described later.

Returning now to FIG. 6: in step S602, the CPU 201 of the server device 102 performs the preprocessing for each image group. The details are described using steps S708 to S712 of FIG. 7.

In step S708 of FIG. 7, the CPU 201 of the server device 102 determines whether an image group has been determined. If it determines that an image group has been determined, the process proceeds to step S709. If no image group has been determined, that is, if the image was assigned to the other group in step S707, the process proceeds to step S710.

In step S709, the CPU 201 of the server device 102 applies to the image acquired in step S701 the preprocessing (modification processing) set for the determined image group. One preprocessing pattern (modification pattern) is applied to one image. For example, if the image has been assigned to group A, two preprocessing patterns are set for group A, so the image is duplicated and the preprocessing set in preprocessing pattern a of FIG. 8 is applied to one copy. The preprocessing set in preprocessing pattern d is applied to the other copy, and a preprocessed image is generated for each.

In step S710, the CPU 201 of the server device 102 generates preprocessed images. More specifically, the CPU 201 of the server device 102 duplicates the image acquired in step S701 once for each configured preprocessing pattern and applies the configured preprocessing patterns to the copies so that all of them are executed (modification processing is performed). One preprocessing pattern is applied to one image, and preprocessed images are generated for all of the configured preprocessing patterns. The CPU 201 of the server device 102 repeats the processing of step S710 until the preprocessed images for all configured preprocessing patterns have been generated.
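Steps S709 and S710 can be sketched together: a grouped image is copied once per pattern set for its group, and an image in the other group is copied once per configured pattern. This is a simplified sketch under stated assumptions. The pattern contents are reduced to a single binarization step, and the threshold values, the `PATTERNS` and `GROUP_PATTERNS` tables, and the function names are hypothetical, not taken from FIG. 8 or FIG. 9.

```python
import copy
from typing import Callable, Dict, List

GrayImage = List[List[int]]
Operation = Callable[[GrayImage], GrayImage]


def binarize(gray: GrayImage, threshold: int) -> GrayImage:
    """Set each pixel to white (255) or black (0) around the threshold."""
    return [[255 if p >= threshold else 0 for p in row] for row in gray]


# Hypothetical pattern definitions; real patterns would also include
# steps such as noise removal and direction correction.
PATTERNS: Dict[str, List[Operation]] = {
    "a": [lambda g: binarize(g, 128)],
    "b": [lambda g: binarize(g, 64)],
    "d": [lambda g: binarize(g, 200)],
}
GROUP_PATTERNS: Dict[str, List[str]] = {"A": ["a", "d"]}


def preprocess(image: GrayImage, group: str) -> Dict[str, GrayImage]:
    """Return one preprocessed copy of the image per applicable pattern."""
    # A known group uses only its own patterns (S709);
    # the other group falls back to every configured pattern (S710).
    names = GROUP_PATTERNS.get(group) or list(PATTERNS)
    results = {}
    for name in names:
        work = copy.deepcopy(image)  # the original stays untouched
        for operation in PATTERNS[name]:
            work = operation(work)
        results[name] = work
    return results


image = [[100, 150], [210, 50]]
by_pattern = preprocess(image, "A")
```

For the group A image above, only two copies are produced, whereas an image in the other group would yield three. That difference is the saving in preprocessing work that grouping is meant to provide.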

In step S711, the CPU 201 of the server device 102 uses the OCR processing unit 309 to execute OCR on the generated preprocessed images with a plurality of OCR engines. Because each preprocessing pattern is set as the appropriate preprocessing for a particular OCR engine, each preprocessed image is routed to the OCR engine that corresponds to the preprocessing applied to it, and OCR is performed there.
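The routing in step S711 amounts to a lookup from preprocessing pattern to OCR engine. The sketch below assumes each engine can be represented as a callable that takes an image and returns text. The `run_ocr` function, the pattern-to-engine table, and the two fake engines are hypothetical and stand in for real OCR engines.

```python
from typing import Callable, Dict, List, Tuple

GrayImage = List[List[int]]
Engine = Callable[[GrayImage], str]


def run_ocr(
    preprocessed: Dict[str, GrayImage],
    engine_for_pattern: Dict[str, str],
    engines: Dict[str, Engine],
) -> List[Tuple[str, str, str]]:
    """Send each preprocessed image to the engine paired with its pattern.

    Returns (pattern name, engine name, recognized text) tuples.
    """
    results = []
    for pattern, image in preprocessed.items():
        engine_name = engine_for_pattern[pattern]
        text = engines[engine_name](image)
        results.append((pattern, engine_name, text))
    return results


# Fake engines for demonstration only; they do not perform real OCR.
fake_engines: Dict[str, Engine] = {
    "engine1": lambda img: "text from engine1",
    "engine2": lambda img: "text from engine2",
}
routing = {"a": "engine1", "d": "engine2"}
outputs = run_ocr({"a": [[0, 255]], "d": [[255, 0]]}, routing, fake_engines)
```

Keeping the pattern name in each result makes it possible to trace which preprocessing produced which OCR output when the results are compared later.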

In step S712, the CPU 201 of the server device 102 outputs the result of the OCR performed in step S711.

This concludes the description of the preprocessing for each image group in FIG. 7. With this processing, the necessary preprocessing can be applied to an image whose image group has been determined, and OCR can be performed by an appropriate OCR engine. Ordinarily, finding the optimum preprocessing would require generating preprocessed images for every configured preprocessing pattern and running OCR with every OCR engine. That approach, however, requires generating a large number of preprocessed images and executing a large amount of preprocessing, which places a load on the server device 102. By performing processing such as that of the present invention, it is possible to perform only the minimum necessary and appropriate preprocessing and to execute OCR with the optimum OCR engine.

Returning to FIG. 6: the result of performing OCR on the preprocessed images produced in step S602 is then output. This is described more specifically using the processing of steps S713 to S717 of FIG. 7.

In step S713 of FIG. 7, for each item of the form in the form image on which OCR was performed, the CPU 201 of the server device 102 compares the OCR results produced by the plurality of OCR engines and determines whether the value of the item matches a regular expression.

In step S714, for each item of the form in the form image on which OCR was performed, the CPU 201 of the server device 102 determines whether the value is within the range of correct values.

In step S715, for each item of the form in the form image on which OCR was performed, the CPU 201 of the server device 102 determines whether the confidence level is equal to or greater than a reference value.

The CPU 201 of the server device 102 repeats the processing of steps S713 to S715 for each item of the form, as many times as there are results output by OCR.

In step S716, the CPU 201 of the server device 102 uses the results of the OCR performed with the plurality of OCR engines, together with the determination results from steps S713 to S715, to extract the items judged to be best. The CPU 201 of the server device 102 then uses the extracted items to decide (adopt) the final OCR result for each item.

In step S717, the CPU 201 of the server device 102 merges the items extracted in step S716 from the results of the OCR performed with the plurality of OCR engines, and outputs the final OCR result.
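The per-item checks of steps S713 to S715 and the selection and merge of steps S716 and S717 can be sketched as follows. The text does not state how the "best" candidate is chosen, so the scoring used here (number of checks passed, with confidence as the tie-breaker) is an assumption, as are the `Candidate` and `FieldRule` types, the field names, and the sample values.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    engine: str
    value: str
    confidence: float


@dataclass
class FieldRule:
    pattern: str                        # regular expression check (S713)
    min_value: Optional[float] = None   # value range check (S714)
    max_value: Optional[float] = None
    min_confidence: float = 0.8         # confidence check (S715)


def checks_passed(c: Candidate, rule: FieldRule) -> int:
    """Count how many of the three checks the candidate passes."""
    regex_ok = re.fullmatch(rule.pattern, c.value) is not None
    range_ok = True
    if rule.min_value is not None or rule.max_value is not None:
        try:
            number = float(c.value)
        except ValueError:
            range_ok = False  # not numeric, so it cannot be in range
        else:
            range_ok = (
                (rule.min_value is None or number >= rule.min_value)
                and (rule.max_value is None or number <= rule.max_value)
            )
    conf_ok = c.confidence >= rule.min_confidence
    return sum([regex_ok, range_ok, conf_ok])


def merge_best(
    results: Dict[str, List[Candidate]], rules: Dict[str, FieldRule]
) -> Dict[str, str]:
    """Pick the best candidate per field (S716) and merge them (S717)."""
    merged = {}
    for name, candidates in results.items():
        rule = rules[name]
        best = max(candidates, key=lambda c: (checks_passed(c, rule), c.confidence))
        merged[name] = best.value
    return merged


results = {
    "amount": [Candidate("engine1", "12O0", 0.95), Candidate("engine2", "1200", 0.85)],
    "date": [Candidate("engine1", "2019-08-27", 0.90), Candidate("engine2", "2019-08-2?", 0.60)],
}
rules = {
    "amount": FieldRule(r"\d+", min_value=0, max_value=1_000_000),
    "date": FieldRule(r"\d{4}-\d{2}-\d{2}"),
}
final = merge_best(results, rules)
```

In this example the amount is taken from the second engine even though the first engine reported higher confidence, because the first engine's value contains a letter and fails the regular expression and range checks. Choosing per item rather than per engine is what allows the merged result to be better than any single engine's output.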

This concludes the description of FIG. 7. When a plurality of OCR engines are used, the optimum preprocessing is performed for each OCR engine, and the best results are finally merged from among the OCR results of the plurality of OCR engines, which makes it possible to raise OCR recognition accuracy.

In the present embodiment, all of the processing is performed by the server device 102, but the processing may instead be divided among a plurality of devices, with each device executing part of it.

As described above, the present invention can provide a mechanism capable of easily generating the images needed to analyze deterioration tendencies.

Needless to say, the structures and contents of the various data described above are not limited to those described, and may take various structures and contents depending on the intended use and purpose.

Although embodiments of the present invention have been described in detail above, the present invention can be embodied as, for example, a system, an apparatus, a method, a program, or a storage medium. Specifically, it may be applied to a system composed of a plurality of devices, or to an apparatus consisting of a single device.

Needless to say, the object of the present invention is also achieved by supplying a system or apparatus with a storage medium on which is recorded the program code of software that realizes the functions of the above-described embodiment, and having a computer (or a CPU or MPU) of that system or apparatus read and execute the program code stored in the storage medium.

In this case, the program code read from the storage medium itself realizes the functions of the above-described embodiment, and the program code itself and the storage medium storing the program code constitute the present invention.

As the storage medium for supplying the program code, for example, a flexible disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, a non-volatile memory card, or a ROM can be used.

Needless to say, the present invention covers not only the case where the functions of the above-described embodiment are realized by a computer executing the program code it has read, but also the case where an OS (basic system or operating system) running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiment are realized by that processing.

Needless to say, the present invention also covers the case where the program code read from the storage medium is written to a memory provided in a function expansion board inserted in the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided in that function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiment are realized by that processing.

Configurations that combine the above-described embodiments and their modifications are also all included in the present invention.

130 LAN
101 Information processing device
102 Server device
201 CPU
202 ROM
203 RAM
211 External memory
212 GPU

Claims

An information processing system comprising:
acquisition means for acquiring a first image containing information on image deterioration;
learning means for causing a learning model to learn, as learning data, the first image acquired by the acquisition means; and
generation means for generating, by inputting a scanned second image into the learning model, a third image that is different from the first image, using the second image and the result of learning by the learning model.

The information processing system according to claim 1, wherein the learning means causes the learning model to learn, as learning data, the third image generated by the generation means.

The information processing system according to claim 1 or 2, wherein the learning means causes the learning model to learn information about the deterioration of the first image for each deterioration pattern, and
the generation means generates the third image for each deterioration pattern.

The generation means is characterized in that the third image is generated by applying the information on the deterioration trained by the learning means to the second image input to the learning model. Item 1. The information processing system according to Item 1.

The information processing system according to claim 4, wherein the generation means generates the third image by changing or adjusting at least one of the brightness and the contrast of the image based on the information regarding the deterioration of the image.

A reception means that accepts the selection of which deterioration pattern is among a plurality of deterioration patterns,
Further prepare
The generation means is characterized in that, by inputting the second image into the learning model, the third image based on the deterioration pattern whose selection is accepted by the reception means is generated. The information processing system according to any one of 4.

The information processing system according to claim 6, wherein the plurality of deterioration patterns include at least one of deterioration due to shadow, deterioration due to light reflection, deterioration due to overexposure, and deterioration due to character distortion.

The information processing system according to any one of claims 1 to 7, wherein the first image is a photographed or scanned image.

The information processing system according to claim 8, wherein the first image is an image in which characters are difficult to read.

The information processing system according to claim 8 or 9, wherein the first image is a deteriorated image containing at least one of deterioration due to shadow, reflection of light, blown-out highlights, distortion of characters due to folding or bending of a document, sprinkling of characters due to a decrease in image resolution, and bleeding.

The information processing system according to any one of claims 1 to 10, further comprising OCR processing means for performing OCR processing on an image,
wherein the OCR processing means performs OCR processing on the third image generated by the generation means.

A method of controlling an information processing system, the method comprising:
an acquisition step of acquiring a first image containing information on image deterioration;
a learning step of causing a learning model to learn, as learning data, the first image acquired in the acquisition step; and
a generation step of generating, by inputting a scanned second image into the learning model, a third image that is different from the first image, using the second image and the result of learning by the learning model.

A program for functioning as an information processing system, the program causing the information processing system to function as:
acquisition means for acquiring a first image containing information on image deterioration;
learning means for causing a learning model to learn, as learning data, the first image acquired by the acquisition means; and
generation means for generating, by inputting a scanned second image into the learning model, a third image that is different from the first image, using the second image and the result of learning by the learning model.