JP3600364B2

JP3600364B2 - Character extraction method and apparatus

Info

Publication number: JP3600364B2
Application number: JP11262796A
Authority: JP
Inventors: 秀明山形
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1996-05-07
Filing date: 1996-05-07
Publication date: 2004-12-15
Anticipated expiration: 2016-05-07
Also published as: JPH09297816A

Description

【０００１】
【発明の属する分野】
本発明は、文字認識装置の文字切り出し技術に関する。
【０００２】
【従来の技術】
ＯＣＲにおいては、画像の潰れ／かすれによる文字画像の分離／接続のため、文字切り出しが正常に働かないことがある。文字切り出しが正常でない場合、認識結果を誤るだけでなく、認識結果の修正作業が煩雑になり、ユーザーの負担の増加が著しい。
【０００３】
文字画像を正しく切り出すためには、そのような分離／接続している文字画像の統合／分割を行う必要があるが、従来は、特開平１−１１４９９２号公報に見られるように、画像の大きさに基づいて、画像の統合／分離の判断をする方法が採用されることが多かった。
【０００４】
【発明が解決しようとする課題】
しかしながら、プロポーショナル印字された英文書や、英文／英単語の混在した日本語文書等では、個々の画像の大きさだけでは、その画像の分離／接続の判断が困難であり文字切り出し性能の向上を期待できない。
【０００５】
本発明の目的は、そのような文書に対する文字切り出し精度を向上させることが可能な文字切り出し方法及び装置を提供することにある。
【０００６】
【課題を解決するための手段】
本発明は、文字認識の前処理として、入力画像中の画像が単独文字画像（単独で１文字を構成する画像）であるか否かの判断を行う。このような前処理を行えば、その後の文字切り出しにおいて、とり得るパスが減るため処理負担が軽減するとともに、画像の統合／分離の誤りも生じにくくなり文字切り出しの精度が向上する。
【０００７】
しかし、個々の画像の大きさに着目する方法によっては、プロポーショナル印字の英文書や英文／英単語の混在した日本語文書等においては、そのような単独文字画像の判断を精度よく行うことは期待できない。
【０００８】
そのような文書においても精度よく単独文字画像であるか否かを判断するため、本発明は画像の並びに着目した知識を利用する。この知識とは「入力画像中に同一形状の画像Ａ１，Ａ２，．．．が複数存在し、その各画像Ａ１，Ａ２，．．．が単独文字画像であるならば、Ａ１，Ａ２，．．．の前後に様々な異なった画像が来る可能性が高い。他方、その画像Ａ１，Ａ２，．．．が文字画像の分離画像であれば、それぞれの前後の同じ位置に残りの部分である同一形状の画像ａ１，ａ２，．．．が来るであろう。そして、このことは、プロポーショナル印字の英文書や英文／英単語の混在した日本語文書等においても適用できる。」というものである。しかして、プロポーショナル印字の英文書や英文／英単語の混在した日本語文書等に対しても、前処理において単独文字画像か否かの判断を精度よく行うことができ、したがって、文字切り出しの精度も向上する。
【０００９】
【発明の実施の形態】
本発明の実施の形態を明らかにするため、図面を参照し本発明の実施例について説明する。
【００１０】
図１は、本発明の一実施例である文字認識システムのブロック図である。このブロック図は、説明の便宜のため処理の流れに対応付けて簡略化されており、実際のハードウエア構成を忠実に表すものではない。実際的には、この文字認識システムは、画像その他のデータを格納するための１つ又は複数のメモリ、処理のためのプログラムのような固定情報を格納する１つ又は複数のメモリ、プログラムに従って必要な処理及び制御を実行するＣＰＵ、外部とのデータ入出力のための入出力機器とそのインターフォース等により構成される。
【００１１】
図１において、画像入力部５０はイメージスキャナ等により文書の２値画像データを入力し、図示しないメモリに格納する。行切り出し部５１では、入力された文書画像より黒ラン連結成分の外接矩形を抽出し、外接矩形の文字行方向の接続関係を調べることによって文字行の範囲を切り出す。以下の処理において「画像」とは、この外接矩形の内部の画像を意味する。
【００１２】
文字切り出し前処理部５２は、本発明の要旨に関わる部分であり、行切り出し段階で切り出された画像の中から、単独で１つの文字画像であると判断できる画像（単独文字画像）を抽出する。この処理は、本実施例では１行単位で行われるが、処理単位は複数行、ページ等、基本的に任意でよい。ただし、処理単位が大きくなるほど処理の負担は大きくなる。この文字切り出し前処理部５２は処理内容の観点から同一画像グルーピング処理部５２Ａ、グループ間パラメータ算出部５２Ｂ及び単独文字画像抽出部５２Ｃに分けることができるが、それぞれの部分の処理内容は後述する。
【００１３】
文字切り出し部５３では、行切り出し部により切り出された画像より、個々の文字の画像を切り出すが、この際に、文字切り出し前処理部５２で単独文字画像として抽出された画像はそのまま文字画像として切り出し、統合／分離の対象から外す。このように、文字切り出しの前処理で単独文字画像と判断可能な画像を抽出しておくと、全ての画像を文字候補として文字切り出しを行う場合に比べ、文字切り出しの処理量が減少するとともに、不適切な切り出しが起こりにくくなる。
【００１４】
文字認識部５４では切り出された各文字画像に対する文字認識を行う。この文字認識の結果はメモリに一時的に蓄積された後、あるいは、そのまま、出力部５５により外部に出力される。なお、文字認識の結果を文字切り出し部５３にフィードバックし、文字切り出しを修正する方法が採用されてもよい。
【００１５】
以下、本発明の要旨に関わる部分である文字切り出し前処理部５２の処理内容について詳細に説明する。図２は、文字切り出し前処理部５２の処理の一例を示すフローチャートである。図３は図２中の最初のステップＳ１００の処理、すなわち文字切り出し前処理部５２内の同一画像グルーピング部５２Ａの処理の一例を示すフローチャートである。
【００１６】
まず、同一画像グルーピング処理について説明する。この処理は、行切り出しの際に抽出された画像の中で、形状が同一の画像と見なせる画像をグループ化することを目的としている。
【００１７】
行中の画像の中から同一か否か比較する画像の組を選び（ステップＳ２０１）、その画像の高さの差又は幅の差のいずれかでも所定の閾値ＷＤ＿ｔｈより大きいか調べる（ステップＳ２０２）。高さの差又は幅の差の一方でもＷＤ＿ｔｈより大きければ、その２つの画像は非同一と判断する（ステップＳ２０７）。ステップＳ２０２において高さの差も幅の差もＷＤ＿ｔｈ以下ならば、それぞれの画像の重心を求め、画像の左上角を基準とした重心の位置の差が所定の閾値ＳＴ＿ｔｈより大きいか調べ（ステップＳ２０３）、大きいならば、その２つの画像の非同一と判断する（ステップＳ２０７）。
【００１８】
ステップＳ２０３で重心位置の差がＳＴ＿ｔｈ以下ならば、その２つの画像の重心位置を合わせて排他的論理和（ＥＯＲ）画像を作成する（ステップＳ２０４）。このＥＯＲ画像に対して２×２画素のマスク処理を行い、２×２画素の領域内に黒画素（２つの画像の画素値が異なる画素）が３個以上検出されるか調べ
（ステップＳ２０５）、３個以上検出されたならば、２つの画像を非同一と判断する（ステップＳ２０７）。ステップＳ２０５において、全ての２×２画素領域を処理しても黒画素が３個以上検出されなかったときには、その２つの画像を同一と判断する（ステップＳ２０６）。
【００１９】
１組の画像の同一又は非同一を判断した後、まだ判断が済んでいない画像の組が残っているならば、ステップＳ２０８からステップＳ２０１へ戻り、残っている別の画像の組について同様の同一性判定の処理を行う。
【００２０】
全ての画像の組について同一／非同一を判断したならば、同一と判断された画像のグループにグループ固有のグループ番号を割り当てる（ステップＳ２０９）。この際、画像Ａと画像Ｂが同一と判断され、画像Ｂと画像Ｃが同一と判断された場合、仮に画像Ａと画像Ｃとが非同一と判断されたとしても、画像Ａ，Ｂ，Ｃを同じグループの画像として同じグループ番号を割り当てる。
【００２１】
最後に、各グループについて、グループ内の画像の大きさを調べ、グループ内に１つでも大きさが異常な画像がみつかったときには、そのグループを異常画像グループとしマークを付ける（ステップＳ２１０）。異常な大きさとは、文字認識の対象文字となり得ないほど大きいか又は小さいということである。具体的には、画像の大きさの上限をＳmax，下限をＳmin、画像の幅をＲＷ，画像の高さをＲＨとしたとき、次の条件のいずれかを満たす画像を異常画像とする。
条件１：ＲＷ＞Ｓmax 又はＲＨ＞Ｓmax
条件２：ＲＷ＜Ｓmin かつＲＨ＜Ｓmin
【００２２】
図２のフローチャートに戻って、ステップＳ１０１以下の処理を説明するが、その前に、単独文字画像の抽出処理の概要を説明する。
【００２３】
基本的には、ある同一画像グループ（注目グループ、異常画像グループは除く）内の画像に注目し、それが他の同一画像グループ（参照グループ、異常画像グループの場合もある）内の画像と統合される可能性があるか判断する。すなわち、分離文字画像である可能性、つまり単独文字画像でない可能性を判断する。
【００２４】
注目グループＧａにｎ個の画像Ａ１，Ａ２，．．．，Ａｎがあり、参照グループＧｂにｍ個の画像Ｂ１，Ｂ２，．．．，Ｂｍがあり、それぞれのグループ内の画像間で、例えば次のような位置関係が同じ画像の組があるとする。
Ｃ１＝（Ａ２，Ｂ１）
Ｃ２＝（Ａ３，Ｂ３）
．．．
ＣN ＝（Ａn＿１，Ｂm）
これらの画像の組について、後述のグループ間パラメータを用いて統合の可能性を判断する。全てのグループの組合せについて同様の判断を行って、グループＧａの画像について例えば表１に示すような結果が得られたとする。
【００２５】
【表１】

【００２６】
この表１は、グループＧａの画像Ａ１〜Ａ８の中で、Ａ１はグループＧｂの画像と統合可能性があり、Ａ２はグループＧｃの画像と統合可能性があり、Ａ３はグループＧｃとグループＧｄの画像と統合可能性があり、Ａ４はグループＧｂの画像と統合可能性があり、．．．、Ａ６はどのグループの画像とも統合可能性がなく、．．．．、Ａ８はグループＧｄの画像と統合可能性がある、ことを表している。すなわち、この例では、グループＧａ内の８個の画像のうち、７個の画像は少なくとも一つの他のグループの画像と統合可能性がある。この統合可能性のある画像の割合をＡｌとし、これが予め決めたしきい値Ａｌ＿ｔｈより小さい場合に、グループＧａの画像を単独文字画像と判断する。つまり、「グループＧａの画像が単独文字画像ならば、それが部分文字画像の場合に比べ、各画像の前後に色々な文字画像が来るはずで割合Ａｌは小さいはずだ」という知識を利用する。この例では、Ａｌ＝７／８であり、Ａｌ＿ｔｈ＝０．８とすると、グループ
Ｇａの画像は単独文字画像ではないと判断されることになる。
【００２７】
図２に戻る。ステップＳ１０１において、カウンタＣＴを０に初期化する。このカウンタＣＴは、処理ループ内で単独文字画像が抽出されたときにステップＳ１１１でインクリメントされるカウンタである。新たな単独文字画像が抽出されなくなると、カウンタＣＴは０のままであるためステップＳ１１３で終了と判断され、処理を終了する。換言すれば、単独文字画像が抽出されなくなるまで処理を繰り返して行う。
【００２８】
ステップＳ１０２において、同一画像グルーピング処理によって得られた同一画像グループの中から、異常画像グループでなく、また、既に単独文字画像として抽出されたグループでない一つのグループを、注目グループとして選ぶ。
【００２９】
次のステップＳ１０３において、同一画像グループ（異常画像グループも含む）の中から、既に単独文字画像として抽出されたグループでも現在の参照グループでもない一つの同一画像グループを参照グループとして選ぶ。
【００３０】
ステップＳ１０４において、注目グループ内の画像（注目画像）と参照グループ内の画像（参照画像）について、位置関係の同じ注目画像と参照画像の組を収集し、位置関係の同じ注目画像と参照画像の組に関して下記のグループ間パラメータＤ，Ｎ，ｎ，ｍ，Ｐｓ，Ｐｖを算出する（グループ間パラメータ算出部５２Ｂの処理）。この際、例えば、重心の相対的な位置関係が、縦、横共に±１画素の範囲内にあるときに、それら画像は同じ位置関係にあると判断する。例えば、注目画像Ａ２の重心の３画素右、１１画素上に参照画像Ｂ２の重心があり、注目画像Ａ３の重心の２画素右、１１画素上に参照画像Ｂ３の重心がある場合、これら２つの画像の組は位置関係が同じと判断する。
【００３１】
１）画像間距離；Ｄ
注目画像と参照画像の重心の距離である。
【００３２】
２）参照画像同一位置存在数；Ｎ
注目画像から見て、参照画像が同じ位置関係にある数である。図４において、具体的には説明する。
ある行から、同一画像グルーピング処理によって、図４の（Ａ）に示すようなグループ番号１〜グループ番号７までの７グループが抽出されたとする。グループ番号２を注目グループ、グループ番号４を参照グループとしたときに、注目画像と参照画像が同じ位置関係になるのは図４（Ｂ）に示すように２通りである。つまり、Ｎ＝２である。また、グループ番号５を注目グループ、グループ番号２を参照グループとしたときは、図４（Ｃ）に示すようにＮ＝１である。
【００３３】
３）注目グループ内画像数；ｎ
注目グループに選ばれた同一画像グループに含まれる画像数である。これは同一画像グルーピング処理の段階で予め求めておいてもよい。図４（Ａ）に示す例で、グループ番号２を注目グループとしたときは、ｎ＝３である。
【００３４】
４）参照グループ内画像数；ｍ
参照グループに選ばれた同一画像グループに含まれる画像数であり、同一画像グルーピング処理の段階で予め求めておいてもよい。図４（Ａ）の例において、グループ番号４を参照グループとしたときは、ｍ＝２である。なお、参照グループが異常画像グループであるときには、ｍ＞ｍ＿ｔｈを満たす値をｍに設定する（ｍ＿ｔｈは予め定められた一定値）。
【００３５】
５）参照画像存在確率；Ｐｓ
注目画像から見て、参照画像が同じ位置関係にある確率であり、具体的には次式で計算される。
Ｐｓ＝ｍａｘ（Ｎ／ｍ，Ｎ／ｎ）
図４（Ａ）の例において、グループ番号２を注目グループ、グループ番号４を参照グループとした場合、Ｐｓ＝ｍａｘ（２／２，２／３）＝２／２となる。また、グループ番号５を注目グループ、グループ番号２を参照グループとした場合には、Ｐｓ＝ｍａｘ（１／３，１／３）＝１／３となる。
【００３６】
６）参照外画像参照画像位置存在数；Ｐｖ
注目画像から見て、参照画像の位置で参照グループ以外の画像が占める面積の割合である。図５により説明する。
図５において、（Ａ）は図４の（Ａ）と同じ行の同一画像グルーピングの結果を示している。グループ番号２を注目グループ、グループ番号４を参照グループとした場合、図５（Ｂ）に示すように、注目画像から見て参照画像の位置でグループ番号３の画像が占める面積（斜線領域）の、注目画像の面積（斜線領域）に対する割合がＰｖであり、例えばＰｖ＝０．１８となる。なお、参照外画像の数は考慮しない。例えば、グループ番号３の画像が図５（Ｂ）のような位置関係に来る場合が２つ以上あっても、Ｐｖは変わらない。また、グループ番号５を注目グループ、グループ番号２を参照グループとした時、グループ番号２の画像が占める面積割合は０．８２、グループ番号６の画像が占める面積割合は０．８７であり、これ以外に参照グループ２と同じ位置関係になる参照外画像はないので、Ｐｖ＝０．８２＋０．８７＝１．６９となる。
【００３７】
ステップＳ１０５において、前ステップで得られたパラメータに関し、次の全ての条件が満たされる時には、その注目画像と参照画像の組は統合される可能性があると判断し、ステップＳ１０６で、その注目画像と参照画像に統合可マークを付け次のステップＳ１０７へ進む。条件の一つでも満たされないときには、ステップＳ１０６はスキップされる。
【００３８】
条件ａ：Ｄ＜Ｄ＿ｔｈ，
ただし、Ｄ＿ｔｈは予め定められた値である。
条件ｂ：（Ｐｓ＞Ｐｓ＿ｔｈ）又は（Ｐｖ＜Ｐｖ＿ｔｈ）
ただし、Ｐｓ＿ｔｈ，Ｐｖ＿ｔｈは予め定められた値である。
条件ｃ：（Ｎ＞Ｎ＿ｔｈ）かつ（ｎ＞ｎ＿ｔｈ）かつ（ｍ＞ｍ＿ｔｈ）
ただし、Ｎ＿ｔｈ，ｎ＿ｔｈ，ｍ＿ｔｈは予め定められた値である。
なお、上記条件ａは、あまりに離れた画像は統合すべきでないと判断するため設けられている。
【００３９】
ステップＳ１０７において、現在の注目グループに関し、未だ参照グループとして選ばれていない同一画像グループ（単独文字画像と判断されたグループを除く、異常画像グループも含む）が残っているか調べ、残っているならばステップＳ１０３に戻り、参照グループを改めて選び、ステップＳ１０４以下の処理を実行する。
【００４０】
現在の注目グループに対し全ての同一画像グループを参照グループとして処理を行ったならば、ステップＳ１０８において、注目グループに選ぶべき同一画像グループ（単独文字画像と判断されたグループは除く）が残っているか調べる。残っていなければステップＳ１０９に進むが、残っているならばステップＳ１０２に戻り、次の注目グループを選び、ステップＳ１０３以下の処理を行う。
【００４１】
ステップＳ１０９において、異常画像グループを除いた同一画像グループの中から、単独文字画像と判断された以外の一つの同一画像グループを選ぶ。次のステップＳ１１０において、前ステップで選んだグループ内の画像の数Ｔと、そのグループ内の画像の中で統合可マークが付いている画像の割合Ａｌを計算し、
Ａｌ≦Ａｌ＿ｔｈかつＴ＞Ｔ＿ｔｈ
（Ａｌ＿ｔｈ，Ｔ＿ｔｈは予め定められる値）
のときに、当該グループは単独文字画像と判断し（グループ内の全ての画像が単独文字画像とされる）、ステップＳ１１１においてカウンタＣＴをインクリメントする。そして、ステップＳ１０９に戻って別の１グループを選び、ステップＳ１１０の条件判定を行い、条件を満たすときにはカウンタＣＴをインクリメントする処理を繰り返す。そして、ステップＳ１１２において、異常画像グループと既に単独文字画像と判断されたグループを除いた全てのグループについて処理したと判定すると、ステップＳ１１３でカウンタＣＴが０であるか判定する。
【００４２】
ＣＴ＝０ならば、新たに単独文字画像とされた同一画像グループが一つも見つからなかったということであるので処理は終了するが、ＣＴ≠０ならば、新たに単独文字画像とされたグループが見つかったということであり、さらに新たな単独文字画像が見つかる可能性があるためステップＳ１０１に戻り、処理を再開する。
【００４３】
【発明の効果】
請求項１乃至７の各項記載の発明によれば、文字認識の前処理として、入力画像中の画像が単独文字画像（単独で１文字を構成する画像）であるか否かの判断を行うため、その後の文字切り出しにおいて、とり得るパスが減るため処理負担が軽減するとともに、画像の統合／分離の誤りも生じにくくなり文字切り出しの精度が向上する。また、個々の画像のサイズに着目するのではなく、同一形状の画像毎に画像をグループ化し、あるグループ内の画像と他のグループ内の画像の位置関係に着目した知識を利用することによって、単独文字画像であるか否かを判断するため、プロポーショナル印字の英文書や英文／英単語の混在した日本語文書等に対しても、前処理において単独文字画像か否かの判断を精度よく行うことができ、したがって文字切り出しの精度も向上する、等の効果を得られる。
【図面の簡単な説明】
【図１】本発明による文字認識システムのブロック図である。
【図２】文字切り出し前処理のフローチャートである。
【図３】同一画像グルーピング処理のフローチャートである。
【図４】統合可能性判断用パラメータの説明のための図である。
【図５】統合可能性判断用パラメータの説明のための図である。
【符号の説明】
５０画像入力部
５１行切り出し部
５２文字切り出し前処理部
５２Ａ同一画像グルーピング部
５２Ｂグループ間パラメータ算出部
５２Ｃ単独文字画像抽出部
５３文字切り出し部
５４文字認識部
５５出力部

[0001]
[Field of the Invention]
The present invention relates to a character segmentation technology of a character recognition device.
[0002]
[Prior art]
In OCR, character segmentation may fail because blotted or faint print causes character images to become connected or broken apart. When segmentation fails, not only are the recognition results wrong, but correcting them also becomes laborious, which significantly increases the burden on the user.
[0003]
To cut out character images correctly, such broken or connected character images must be integrated or divided. Conventionally, as seen in Japanese Patent Application Laid-Open No. 1-114992, methods that decide whether to integrate or separate images on the basis of image size have often been adopted.
[0004]
[Problems to be solved by the invention]
However, for proportionally printed English documents, Japanese documents containing English sentences or words, and the like, it is difficult to judge from the size of each image alone whether the image is broken or connected, so no improvement in character segmentation performance can be expected.
[0005]
An object of the present invention is to provide a character extraction method and apparatus capable of improving the character extraction accuracy for such a document.
[0006]
[Means for Solving the Problems]
According to the present invention, as a pre-process of character recognition, it is determined whether or not an image in an input image is a single character image (an image that independently forms one character). By performing such preprocessing, the number of paths that can be taken in subsequent character extraction is reduced, so that the processing load is reduced. In addition, errors in image integration / separation are less likely to occur, and the accuracy of character extraction is improved.
[0007]
However, a method that focuses on the size of each image cannot be expected to make this single character image determination accurately for proportionally printed English documents, Japanese documents containing English sentences or words, and the like.
[0008]
To determine accurately whether an image is a single character image even in such documents, the present invention uses knowledge about the arrangement of images. This knowledge is as follows: "If a plurality of images A1, A2, ... of the same shape exist in the input image and each of them is a single character image, then various different images are likely to appear before and after A1, A2, .... On the other hand, if the images A1, A2, ... are separated parts of character images, then images a1, a2, ... of the same shape, namely the remaining parts, will appear at the same position before or after each of them. This also applies to proportionally printed English documents, Japanese documents containing English sentences or words, and the like." Therefore, even for such documents, the pre-processing can determine accurately whether an image is a single character image, and the accuracy of character segmentation improves accordingly.
[0009]
[Embodiments of the Invention]
To clarify the embodiments of the present invention, examples will be described with reference to the drawings.
[0010]
FIG. 1 is a block diagram of a character recognition system according to one embodiment of the present invention. For convenience of description, the block diagram is simplified to follow the flow of processing and does not faithfully represent the actual hardware configuration. In practice, the character recognition system comprises one or more memories for storing images and other data, one or more memories for storing fixed information such as processing programs, a CPU that executes the necessary processing and control according to the programs, input/output devices for exchanging data with the outside and their interfaces, and so on.
[0011]
In FIG. 1, an image input unit 50 inputs binary image data of a document using an image scanner or the like, and stores it in a memory (not shown). The line cutout unit 51 extracts a circumscribed rectangle of the black run connected component from the input document image, and cuts out the range of the character line by examining the connection relation of the circumscribed rectangle in the character line direction. In the following processing, “image” means an image inside the circumscribed rectangle.
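As a rough illustration of how circumscribed rectangles of connected black components might be obtained, the following sketch labels components with a breadth-first search. The function name `bounding_boxes`, the list-of-lists image representation, and the use of 8-connectivity are assumptions made for this example; the text only states that rectangles of black run connected components are extracted.

```python
from collections import deque


def bounding_boxes(img):
    """Return (left, top, right, bottom) for each connected black component.

    img is a list of rows of 0/1 values. 8-connectivity is an assumption.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not img[y0][x0] or seen[y0][x0]:
                continue
            # Start a new component at the first unvisited black pixel.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            left = right = x0
            top = bottom = y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx in (x - 1, x, x + 1):
                    for ny in (y - 1, y, y + 1):
                        if 0 <= nx < w and 0 <= ny < h and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


page = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
print(bounding_boxes(page))
```

Each rectangle returned here corresponds to one "image" in the sense used in the rest of the description.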
[0012]
The character extraction pre-processing unit 52 is the part that relates to the gist of the present invention. From the images cut out in the line extraction stage, it extracts those images that can be judged to form one character image by themselves (single character images). In this embodiment the processing is performed one line at a time, but the processing unit is basically arbitrary and may be several lines, a page, and so on. The larger the processing unit, however, the heavier the processing load. In terms of processing content, the character extraction pre-processing unit 52 can be divided into an identical image grouping processing unit 52A, an inter-group parameter calculation unit 52B, and a single character image extraction unit 52C; the processing of each part is described later.
[0013]
The character cutout unit 53 cuts out the image of each character from the images cut out by the line cutout unit. In doing so, an image that the character extraction pre-processing unit 52 has extracted as a single character image is cut out as a character image as it is and is excluded from integration / separation. Extracting in advance the images that can be judged to be single character images reduces the amount of character segmentation processing, compared with treating every image as a character candidate, and also makes inappropriate segmentation less likely.
[0014]
The character recognition unit 54 performs character recognition on each of the cut-out character images. The recognition result is output to the outside by the output unit 55, either after being temporarily stored in memory or directly. A method in which the recognition result is fed back to the character cutout unit 53 to correct the character segmentation may also be adopted.
[0015]
Hereinafter, the processing of the character extraction pre-processing unit 52, the part related to the gist of the present invention, will be described in detail. FIG. 2 is a flowchart showing an example of the processing of the character extraction pre-processing unit 52. FIG. 3 is a flowchart showing an example of the first step S100 in FIG. 2, that is, the processing of the same image grouping unit 52A within the character extraction pre-processing unit 52.
[0016]
First, the same image grouping process will be described. Its purpose is to group together, among the images extracted during line extraction, those whose shapes can be regarded as identical.
[0017]
A pair of images to be compared for identity is selected from the images in the line (step S201), and it is checked whether either the height difference or the width difference of the two images is larger than a predetermined threshold WD_th (step S202). If either difference is larger than WD_th, the two images are judged to be non-identical (step S207). If in step S202 neither the height difference nor the width difference exceeds WD_th, the center of gravity of each image is obtained, and it is checked whether the difference between the positions of the centers of gravity, each measured from the upper left corner of its image, is larger than a predetermined threshold ST_th (step S203). If it is larger, the two images are judged to be non-identical (step S207).
[0018]
If the difference between the center-of-gravity positions is ST_th or less in step S203, an exclusive OR (EOR) image is created with the centers of gravity of the two images aligned (step S204). A 2 × 2 pixel mask is applied to this EOR image to check whether three or more black pixels (pixels at which the two images differ) are found within any 2 × 2 pixel region (step S205). If so, the two images are judged to be non-identical (step S207). If no 2 × 2 pixel region containing three or more black pixels is found after all regions have been processed, the two images are judged to be identical (step S206).
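The checks of steps S202 to S205 can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `centroid` and `same_shape`, the default threshold values, the use of the larger of the horizontal and vertical offsets as "the difference in position", and the rounding used to align the centers of gravity are all choices made for the example, not details given in the text.

```python
def centroid(img):
    """Center of gravity (x, y) of the black pixels, measured from the top-left corner."""
    pts = [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)


def same_shape(a, b, wd_th=2, st_th=1.5):
    """Return True if binary images a and b pass the S202-S205 style checks."""
    ha, wa, hb, wb = len(a), len(a[0]), len(b), len(b[0])
    # S202: reject if the height or the width differs by more than WD_th.
    if abs(ha - hb) > wd_th or abs(wa - wb) > wd_th:
        return False
    # S203: reject if the centroid positions differ by more than ST_th
    # (largest axis offset is an assumed distance measure).
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    if max(abs(ax - bx), abs(ay - by)) > st_th:
        return False
    # S204: shift b so that the centroids coincide, then take the XOR.
    dx, dy = round(ax - bx), round(ay - by)
    black_a = {(x, y) for y, row in enumerate(a) for x, v in enumerate(row) if v}
    black_b = {(x + dx, y + dy) for y, row in enumerate(b) for x, v in enumerate(row) if v}
    diff = black_a ^ black_b
    # S205: any 2x2 window holding 3 or more differing pixels means "not identical".
    for (x, y) in diff:
        for ox in (x - 1, x):
            for oy in (y - 1, y):
                window = {(ox, oy), (ox + 1, oy), (ox, oy + 1), (ox + 1, oy + 1)}
                if len(window & diff) >= 3:
                    return False
    return True


ring = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
ring_with_noise = [[1, 1, 1], [1, 0, 1], [1, 1, 0]]   # one pixel missing
ell = [[1, 0, 0], [1, 0, 0], [1, 1, 1]]               # an L-like shape
wide = [[1] * 7 for _ in range(3)]                    # much wider image

print(same_shape(ring, ring_with_noise), same_shape(ring, ell), same_shape(ring, wide))
```

The 2 × 2 mask tolerates isolated differing pixels (scanner noise) while rejecting pairs whose differences form a compact blob.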
[0019]
After one pair of images has been judged identical or non-identical, if pairs that have not yet been judged remain, the process returns from step S208 to step S201 and performs the same identity determination on another remaining pair.
[0020]
When all pairs of images have been judged identical or non-identical, a unique group number is assigned to each group of images judged to be identical (step S209). Here, if images A and B are judged identical and images B and C are judged identical, then images A, B, and C are placed in the same group and given the same group number even if images A and C were judged non-identical.
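The transitive assignment of group numbers in step S209 behaves like a union-find (disjoint set) merge over the pairs judged identical. The sketch below shows that idea; union-find itself is a swapped-in standard technique, and the name `group_by_identity` and the input format (image indices plus a list of identical pairs) are assumptions for illustration.

```python
def group_by_identity(n_images, identical_pairs):
    """Assign a group number to each image index 0..n_images-1.

    identical_pairs lists the (i, j) pairs judged identical. Images linked
    through any chain of identical pairs receive the same group number.
    """
    parent = list(range(n_images))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in identical_pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    numbers, result = {}, []
    for i in range(n_images):
        root = find(i)
        numbers.setdefault(root, len(numbers) + 1)
        result.append(numbers[root])
    return result


# A=0, B=1, C=2: A~B and B~C were judged identical, A~C was not.
print(group_by_identity(5, [(0, 1), (1, 2)]))
```

Images 0, 1, and 2 end up in one group even though the pair (0, 2) never appears in the input, which matches the A, B, C case described above.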
[0021]
Finally, for each group, the size of the image in the group is checked, and if even one image with an abnormal size is found in the group, the group is marked as an abnormal image group (step S210). The abnormal size means that the size is too large or too small to be a target character for character recognition. Specifically, when the upper limit of the size of the image is Smax, the lower limit is Smin, the width of the image is RW, and the height of the image is RH, an image satisfying any of the following conditions is regarded as an abnormal image.
Condition 1: RW > Smax or RH > Smax
Condition 2: RW < Smin and RH < Smin
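Conditions 1 and 2 translate directly into a small predicate. In this sketch the function names, the dictionary layout of `groups`, and the numeric limits are hypothetical values chosen for the example.

```python
def is_abnormal(rw, rh, s_min, s_max):
    """Condition 1 (too large in either dimension) or condition 2 (too small in both)."""
    return (rw > s_max or rh > s_max) or (rw < s_min and rh < s_min)


def abnormal_groups(groups, s_min, s_max):
    """Return the numbers of groups containing at least one abnormal image.

    groups maps a group number to a list of (width, height) tuples.
    """
    return {
        number
        for number, sizes in groups.items()
        if any(is_abnormal(w, h, s_min, s_max) for w, h in sizes)
    }


groups = {
    1: [(10, 12), (11, 12)],   # ordinary characters
    2: [(2, 2)],               # speck: small in both dimensions
    3: [(10, 12), (200, 12)],  # one image is far too wide
    4: [(2, 12)],              # thin but tall, such as "l": not abnormal
}
print(abnormal_groups(groups, s_min=4, s_max=100))
```

Note that condition 2 requires both dimensions to be small, so narrow characters are not flagged.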
[0022]
Returning to the flowchart of FIG. 2, the processing after step S101 will be described. Before that, the outline of the single character image extraction processing will be described.
[0023]
Basically, attention is paid to the images in one same image group (the target group, which is never an abnormal image group), and it is determined whether they may be integrated with images in another same image group (the reference group, which may be an abnormal image group). That is, the possibility that an image is a separated character image, in other words the possibility that it is not a single character image, is determined.
[0024]
Suppose that the target group Ga contains n images A1, A2, ..., An, that the reference group Gb contains m images B1, B2, ..., Bm, and that there are pairs of images, one from each group, that share the same positional relationship, for example:
C1 = (A2, B1)
C2 = (A3, B3)
. . .
CN = (An_1, Bm)
For these sets of images, the possibility of integration is determined using inter-group parameters described later. It is assumed that similar determinations are made for all combinations of groups, and the results shown in Table 1 are obtained for the images of group Ga, for example.
[0025]
[Table 1]

[0026]
Table 1 shows that, among the images A1 to A8 of the group Ga, A1 may be integrated with an image of the group Gb, A2 with an image of the group Gc, A3 with images of the groups Gc and Gd, A4 with an image of the group Gb, ..., A6 with no image of any group, ..., and A8 with an image of the group Gd. That is, in this example, seven of the eight images in the group Ga may be integrated with an image of at least one other group. Let Al be the proportion of images that may be integrated; if Al is smaller than a predetermined threshold Al_th, the images of the group Ga are judged to be single character images. In other words, the following knowledge is used: "If the images of the group Ga are single character images, then, compared with the case where they are partial character images, various character images should appear before and after each image, so the proportion Al should be small." In this example, Al = 7/8, so with Al_th = 0.8 the images of the group Ga are judged not to be single character images.
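The arithmetic of this example can be checked with a few lines. The list `flags` is an assumed encoding of Table 1 that records only whether each of A1 to A8 has any integration candidate (seven do, A6 does not); the helper name `integrable_ratio` is hypothetical.

```python
from fractions import Fraction


def integrable_ratio(flags):
    """Proportion Al of images in a group that carry an integration candidate."""
    return Fraction(sum(flags), len(flags))


# A1..A8 of group Ga: every image except A6 may be integrated with some group.
flags = [True, True, True, True, True, False, True, True]

al = integrable_ratio(flags)
al_th = Fraction(8, 10)
is_single_character_group = al < al_th

print(al, is_single_character_group)
```

Since 7/8 = 0.875 is not smaller than 0.8, the group is not judged to consist of single character images, as stated above.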
[0027]
Returning to FIG. 2: in step S101, the counter CT is initialized to 0. This counter is incremented in step S111 whenever a single character image is extracted within the processing loop. Once no new single character image is extracted, the counter CT remains 0, so termination is decided in step S113 and the process ends. In other words, the process is repeated until no more single character images are extracted.
[0028]
In step S102, from among the same image groups obtained by the same image grouping process, one group that is not an abnormal image group and that is not a group already extracted as a single character image is selected as a target group.
[0029]
In the next step S103, one identical image group that is neither a group already extracted as a single character image nor the current reference group is selected as a reference group from the same image group (including the abnormal image group).
[0030]
In step S104, for the images in the target group (target images) and the images in the reference group (reference images), pairs of a target image and a reference image that share the same positional relationship are collected, and the following inter-group parameters D, N, n, m, Ps, and Pv are calculated for those pairs (processing of the inter-group parameter calculation unit 52B). Here, for example, pairs are judged to have the same positional relationship when the relative positions of their centers of gravity agree within ±1 pixel both vertically and horizontally. For example, if the center of gravity of the reference image B2 lies 3 pixels to the right of and 11 pixels above that of the target image A2, and the center of gravity of the reference image B3 lies 2 pixels to the right of and 11 pixels above that of the target image A3, these two pairs are judged to have the same positional relationship.
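The ±1 pixel comparison can be sketched as below. The function names are hypothetical, centers of gravity are given as (x, y) tuples, and the y axis is assumed to grow downward so that "11 pixels above" becomes an offset of -11; the concrete coordinates are made up so that the offsets reproduce the A2/B2 and A3/B3 example.

```python
def offset(target_centroid, reference_centroid):
    """Offset (dx, dy) of the reference centroid as seen from the target centroid."""
    return (reference_centroid[0] - target_centroid[0],
            reference_centroid[1] - target_centroid[1])


def same_positional_relationship(off1, off2, tol=1):
    """True when two offsets agree within +/- tol pixels on both axes."""
    return abs(off1[0] - off2[0]) <= tol and abs(off1[1] - off2[1]) <= tol


# B2 is 3 px right of and 11 px above A2; B3 is 2 px right of and 11 px above A3.
a2_b2 = offset((40, 30), (43, 19))
a3_b3 = offset((90, 30), (92, 19))

print(a2_b2, a3_b3, same_positional_relationship(a2_b2, a3_b3))
```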
[0031]
1) Distance between images; D
This is the distance between the centers of gravity of the target image and the reference image.
[0032]
2) Number of reference images at the same position; N
This is the number of reference images that lie in the same positional relationship as seen from the target images. It is explained concretely with reference to FIG. 4.
Suppose that seven groups, group number 1 to group number 7, as shown in FIG. 4A, have been extracted from a certain line by the same image grouping process. When group number 2 is the target group and group number 4 is the reference group, there are two cases in which a target image and a reference image have the same positional relationship, as shown in FIG. 4B. That is, N = 2. When group number 5 is the target group and group number 2 is the reference group, N = 1, as shown in FIG. 4C.
[0033]
3) Number of images in the target group; n
This is the number of images included in the same image group selected as the target group. It may be obtained in advance at the stage of the same image grouping process. In the example shown in FIG. 4A, when group number 2 is the target group, n = 3.
[0034]
4) Number of images in reference group; m
This is the number of images included in the same image group selected as the reference group, and it may be obtained in advance at the stage of the same image grouping process. In the example of FIG. 4A, when group number 4 is the reference group, m = 2. When the reference group is an abnormal image group, m is set to a value that satisfies m > m_th (m_th is a predetermined constant).
[0035]
5) Reference image existence probability; Ps
This is the probability that the reference images have the same positional relationship with respect to the target image, and is specifically calculated by the following equation.
Ps = max(N/m, N/n)
In the example of FIG. 4A, when group number 2 is the target group and group number 4 is the reference group, Ps = max(2/2, 2/3) = 2/2. When group number 5 is the target group and group number 2 is the reference group, Ps = max(1/3, 1/3) = 1/3.
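The two worked values of Ps can be reproduced with exact fractions. The function name `reference_presence_probability` is hypothetical; the inputs are the N, n, and m values of the FIG. 4 examples (N = 2, n = 3, m = 2, and N = 1 with n = m = 3).

```python
from fractions import Fraction


def reference_presence_probability(N, n, m):
    """Ps = max(N/m, N/n), computed exactly."""
    return max(Fraction(N, m), Fraction(N, n))


# Target group 2 (n = 3), reference group 4 (m = 2), N = 2.
ps_2_4 = reference_presence_probability(N=2, n=3, m=2)
# Target group 5, reference group 2, N = 1 with n = m = 3.
ps_5_2 = reference_presence_probability(N=1, n=3, m=3)

print(ps_2_4, ps_5_2)
```

Taking the maximum means that Ps reaches 1 as soon as every image of the smaller group has a partner at the same relative position.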
[0036]
6) Non-reference images at the reference image position; Pv
This is the proportion of area occupied, at the position of the reference image as seen from the target image, by images that do not belong to the reference group. It is explained with reference to FIG. 5.
In FIG. 5, (A) shows the same grouping result for the same line as FIG. 4A. When group number 2 is the target group and group number 4 is the reference group, Pv is the ratio of the area (hatched region) that the image of group number 3 occupies at the position of the reference image as seen from the target image, as shown in FIG. 5B, to the area of the target image (hatched region); for example, Pv = 0.18. The number of non-reference images is not taken into account: Pv does not change even if the image of group number 3 appears in the positional relationship of FIG. 5B two or more times. When group number 5 is the target group and group number 2 is the reference group, the area ratio occupied by the image of group number 2 is 0.82 and that occupied by the image of group number 6 is 0.87; since no other non-reference image has the same positional relationship as reference group 2, Pv = 0.82 + 0.87 = 1.69.
[0037]
In step S105, if all of the following conditions are satisfied for the parameters obtained in the previous step, the pair of the target image and the reference image is judged to be possibly integrated; in step S106 an integrable mark is attached to the target image and the reference image, and the process proceeds to step S107. If even one of the conditions is not satisfied, step S106 is skipped.
[0038]
Condition a: D < D_th
Here, D_th is a predetermined value.
Condition b: (Ps > Ps_th) or (Pv < Pv_th)
However, Ps_th and Pv_th are predetermined values.
Condition c: (N> N_th) and (n> n_th) and (m> m_th)
Here, N_th, n_th, and m_th are predetermined values.
The condition a is provided to determine that images that are too far apart should not be integrated.
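Conditions a to c combine into a single predicate, sketched below. The class and function names are hypothetical, and every threshold value, as well as the distance D = 12, is an invented number for illustration; only N = 2, n = 3, m = 2, Ps = 1, and Pv = 0.18 come from the worked examples.

```python
from dataclasses import dataclass


@dataclass
class PairParams:
    D: float    # distance between centers of gravity
    N: int      # reference images at the same position
    n: int      # images in the target group
    m: int      # images in the reference group
    Ps: float   # reference image existence probability
    Pv: float   # area ratio of non-reference images


@dataclass
class Thresholds:
    D_th: float
    Ps_th: float
    Pv_th: float
    N_th: int
    n_th: int
    m_th: int


def may_integrate(p, t):
    """True when conditions a, b, and c all hold (step S105)."""
    cond_a = p.D < t.D_th
    cond_b = p.Ps > t.Ps_th or p.Pv < t.Pv_th
    cond_c = p.N > t.N_th and p.n > t.n_th and p.m > t.m_th
    return cond_a and cond_b and cond_c


# Threshold values below are assumptions, not values from the text.
th = Thresholds(D_th=20, Ps_th=0.8, Pv_th=0.3, N_th=1, n_th=1, m_th=1)
near = PairParams(D=12, N=2, n=3, m=2, Ps=1.0, Pv=0.18)
far = PairParams(D=25, N=2, n=3, m=2, Ps=1.0, Pv=0.18)

print(may_integrate(near, th), may_integrate(far, th))
```

Because an abnormal reference group is given an m above m_th, the m part of condition c never blocks integration with such a group.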
[0039]
In step S107, it is checked, for the current target group, whether any same image group (excluding groups judged to be single character images, but including abnormal image groups) remains that has not yet been selected as the reference group. If one remains, the process returns to step S103, a new reference group is selected, and the processing from step S104 onward is executed.
[0040]
When all same image groups have been processed as reference groups for the current target group, it is checked in step S108 whether any same image group (excluding groups judged to be single character images) remains to be selected as the target group. If none remains, the process proceeds to step S109; if one remains, the process returns to step S102, the next target group is selected, and the processing from step S103 onward is performed.
[0041]
In step S109, one same image group that has not been judged to be a single character image is selected from the same image groups other than abnormal image groups. In the next step S110, the number T of images in the selected group and the proportion Al of images in that group bearing the integrable mark are calculated. If
Al ≦ Al_th and T > T_th
(Al_th and T_th are predetermined values)
holds, the group is judged to consist of single character images (all images in the group are treated as single character images), and the counter CT is incremented in step S111. The process then returns to step S109, selects another group, performs the condition check of step S110, and increments the counter CT when the condition is satisfied; this is repeated. When it is determined in step S112 that all groups other than abnormal image groups and groups already judged to be single character images have been processed, step S113 checks whether the counter CT is 0.
[0042]
If CT = 0, no same image group was newly judged to be a single character image, so the process ends. If CT ≠ 0, a group newly judged to be a single character image was found, and further single character images may still be found, so the process returns to step S101 and starts again.
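The overall loop of steps S101 to S113 can be sketched as follows. This is a simplified illustration: `marked_images` is a hypothetical callback standing in for steps S104 to S106 (it returns the images that receive an integrable mark for a given target and reference group), and recomputing all marks on every pass is an assumption, since the text does not say whether marks are cleared between passes.

```python
def extract_single_groups(groups, abnormal, marked_images, al_th, t_th):
    """Return the set of group ids judged to consist of single character images.

    groups maps a group id to its list of image ids; abnormal is a set of group ids.
    """
    singles = set()
    while True:
        ct = 0                                              # S101
        marked = set()
        targets = [g for g in groups if g not in abnormal and g not in singles]
        for tg in targets:                                  # S102 / S108
            for rg in groups:                               # S103 / S107
                if rg == tg or rg in singles:
                    continue
                marked |= set(marked_images(tg, rg))        # S104-S106
        for g in targets:                                   # S109 / S112
            images = groups[g]
            t = len(images)
            al = sum(1 for i in images if i in marked) / t
            if al <= al_th and t > t_th:                    # S110
                singles.add(g)
                ct += 1                                     # S111
        if ct == 0:                                         # S113
            return singles


# Toy data: "b" and "c" are two halves of a broken character; "d" only looks
# integrable while "a" is still available as a reference group.
groups = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9], "d": [10, 11, 12]}
table = {
    ("b", "c"): [4, 5, 6, 7, 8, 9],
    ("c", "b"): [4, 5, 6, 7, 8, 9],
    ("d", "a"): [10, 11, 12, 1],
}
result = extract_single_groups(
    groups, set(), lambda t, r: table.get((t, r), []), al_th=0.5, t_th=2
)
print(sorted(result))
```

In this toy run "a" is accepted in the first pass; once it is excluded from the reference groups, "d" loses its marks and is accepted in the second pass, which is the reason the process repeats until CT stays 0.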
[0043]
[Effects of the Invention]
According to the inventions of claims 1 to 7, whether each image in the input image is a single character image (an image that forms one character by itself) is determined as pre-processing for character recognition. This reduces the number of possible paths in the subsequent character segmentation, which lightens the processing load, and makes errors in image integration / separation less likely, which improves segmentation accuracy. In addition, rather than focusing on the size of individual images, the invention groups images by identical shape and uses knowledge about the positional relationship between images in one group and images in another group to determine whether an image is a single character image. As a result, the determination can be made accurately in pre-processing even for proportionally printed English documents, Japanese documents containing English sentences or words, and the like, and the accuracy of character segmentation improves accordingly.
[Brief description of the drawings]
FIG. 1 is a block diagram of a character recognition system according to the present invention.
FIG. 2 is a flowchart of the character extraction pre-processing.
FIG. 3 is a flowchart of the same image grouping process.
FIG. 4 is a diagram for explaining parameters for determining integration possibility.
FIG. 5 is a diagram for explaining integration possibility determination parameters.
[Explanation of symbols]
50 Image input unit
51 Line extraction unit
52 Character extraction preprocessing unit
52A Same image grouping unit
52B Inter-group parameter calculation unit
52C Single character image extraction unit
53 Character extraction unit
54 Character recognition unit
55 Output unit

Claims

1. A character segmentation method comprising: grouping input images, each being a black run connected component, by identical shape; and performing, as pre-processing for character segmentation, a determination process that determines whether each input image is a single character image that forms one character by itself, using information on the positional relationship between images in a target group and images in a reference group,
wherein the determination process uses information on whether the images in the target group and the images in the reference group have the same positional relationship to determine whether the images in the target group are single character images.

2. The character segmentation method according to claim 1, wherein, in the determination process, information on the distance between an image in the target group and an image in the reference group is used to determine whether the image in the target group is a single character image.

3. The character segmentation method according to claim 1, wherein, in the determination process, when it is determined that the image in the target group and the images in a plurality of reference groups do not have the same positional relationship, the image in the target group is determined not to be a single character image.

4. The character segmentation method according to claim 1, wherein, in the determination process, after it has been determined, with all other groups used as reference groups, whether each image in the target group is a single character image, a final determination as to whether all the images in the target group are single character images is made on the basis of the proportion or the number of images in the target group determined not to be single character images.

5. The character segmentation method according to claim 1, wherein, in the determination process, the determination of whether an image is a single character image is performed only for groups containing at least a certain number of images.

6. A character segmentation device comprising means for grouping input images, each being a black run connected component, by identical shape, and for performing, as pre-processing for character segmentation, a determination process that determines whether each input image is a single character image that forms one character by itself, using information on the positional relationship between images in a target group and images in a reference group,
wherein the means uses information on whether the images in the target group and the images in the reference group have the same positional relationship to determine whether the images in the target group are single character images.