JP2812704B2

JP2812704B2 - Character extraction device

Info

Publication number: JP2812704B2
Application number: JP1072395A
Authority: JP
Inventors: 敏行射手園; 晃治伊東; 義征山下
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1989-03-24
Filing date: 1989-03-24
Publication date: 1998-10-22
Anticipated expiration: 2013-10-22
Also published as: JPH02252079A

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) This invention relates to a character extraction device for cutting out characters from a character line accompanied by ruled lines.

(Prior Art) In recent years, in order to improve the efficiency of processing large volumes of documents, there has been growing demand for mechanically inputting the information written in documents into a computer. A character recognition device used for such mechanical input separates (cuts out) the image of each character, one character at a time, from an image of a group of printed or handwritten characters, and performs character recognition on each separated single-character image.

Commonly used documents often carry ruled lines in addition to characters, but for character recognition the image of each character must be cut out so that it does not include those ruled lines.

Devices and methods that cut out characters so as to exclude ruled lines include, for example, the optical character reader disclosed in JP-A-60-160487 and the character separation method disclosed in JP-A-62-217385.

In the former conventional device, a row direction and a column direction are virtually set on the image memory that stores the image data from the reading unit (the column direction is perpendicular to the row direction; hereinafter, these virtually set directions are referred to as the predetermined row direction and the predetermined column direction). The image data of the document is scanned in the row direction, and the number of black pixels is accumulated along the row direction during this scan to create a marginal distribution for the row direction. Regions of the marginal distribution that exceed a predetermined number of black pixels are detected as the underline portion, and regions below the predetermined number are detected as the character portion; the positions of the edges of the character portion in the column direction are then detected as the start and end positions of the characters in the column direction. Next, only the character portion, excluding the underline portion, is scanned, the number of black pixels is accumulated along the column direction to create a marginal distribution for the column direction, and the start and end positions of each character in the row direction are detected from changes in the column-direction cumulative black pixel count.
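The row-direction marginal distribution used by the former conventional device can be illustrated with a minimal sketch. It assumes a binary image held as a list of rows with 1 for a black pixel; the function names, the threshold value, and the rule that a character row must contain at least one black pixel are illustrative assumptions, not details taken from the cited publication.

```python
def row_profile(image):
    """Count the black pixels (value 1) along each row of a binary image."""
    return [sum(row) for row in image]


def split_underline_and_text(image, threshold):
    """Classify rows by their black pixel count.

    Rows whose count exceeds the threshold are treated as underline rows.
    Rows with a nonzero count below the threshold are treated as character
    rows (the nonzero requirement is an assumption made for this sketch).
    """
    profile = row_profile(image)
    underline_rows = [y for y, count in enumerate(profile) if count > threshold]
    text_rows = [y for y, count in enumerate(profile) if 0 < count < threshold]
    return underline_rows, text_rows


# Hypothetical 5x6 image: two character rows above one full-width underline.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
underline_rows, text_rows = split_underline_and_text(image, threshold=4)
```

The sketch only works cleanly when the underline is aligned with the image rows, which is the limitation discussed for this conventional device.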

In this conventional device, in order to accurately detect the start and end positions of characters in the row direction when a ruled line runs along the row direction, the image data of the original must be acquired so that the true row direction of the image coincides with the predetermined row direction, thereby reducing the overlap of characters and ruled lines in the predetermined row direction when the marginal distribution is created.

In the latter conventional method, a plurality of local regions are set on an image containing a character line accompanied by a straight line such as an underline; the image of each local region is encoded into a predetermined direction line element by pattern matching, and the direction line elements are then classified. From among the line segments composed of direction line elements (for example, horizontal straight-line components) having the same direction as line segments unnecessary for character recognition, such as underlines, line segments satisfying predetermined conditions (for example, length or region of existence) are detected and removed from the image as unnecessary line segments. After an image free of unnecessary line segments is obtained in this way, it is scanned to create marginal distributions in the row and column directions, and the start and end positions of the characters in the predetermined row and column directions are detected from those distributions.

(Problems to Be Solved by the Invention) However, in the former conventional device described above, if the predetermined row direction set on the image memory deviates from the true row direction of the image, the overlap of characters and ruled lines in the predetermined row direction increases when the marginal distribution is created. Black pixels of characters and black pixels of ruled lines are then mixed on the same main scanning line, and as a result it becomes difficult to accurately detect the positions of the characters and of the ruled lines from the marginal distribution in the predetermined row direction.

Furthermore, in the former conventional device, even when the predetermined row direction deviates from the true row direction, the character start and end positions in the predetermined row direction are made identical for all characters within the same character line. The edges of characters are therefore chipped, and incomplete characters that are effectively unrecognizable are cut out. This chipping becomes more severe the closer a character is to the end of the line.

Moreover, if characters touch the ruled line in addition to the predetermined row direction not matching the true row direction, the accuracy of detecting the character and ruled-line positions deteriorates further, and the chipping of characters becomes even worse.

In the latter conventional method, the read image is divided into a plurality of local regions, the read image of each local region is compared by matching against a plurality of predetermined basic patterns, and each local region is converted into a direction line element based on the comparison result. Such local matching is complex, so a device adopting this conventional method has a complicated configuration, which leads to a larger device and higher cost.

An object of this invention is to solve the conventional problems described above by providing a character extraction device that, even when the predetermined row direction and the true row direction do not coincide, can accurately detect the character start and end positions in the predetermined row and column directions based on changes in the cumulative black pixel count, without chipping the characters.

(Means for Solving the Problems) To achieve this object, the character extraction device of this invention comprises: a reading unit that optically reads an information medium bearing a character line with ruled lines and outputs image data of the information medium; an image data storage unit that stores the image data from the reading unit; a ruled line detection area setting unit that sets a character portion area within the ruled character line area so as not to include the ruled lines, scans the image data within the character portion area in the image data storage unit to detect the cumulative number of black pixels in the predetermined column direction for each scanning line, detects the character start and end positions in the predetermined row direction for each character based on changes in the column-direction cumulative black pixel count, and, based on these row-direction character start and end positions, sets for each character a first ruled line detection area at a position adjacent to the character start position in the predetermined row direction, on the side opposite to the character end position, so as to include a ruled line, and a second ruled line detection area at a position adjacent to the character end position in the predetermined row direction, on the side opposite to the character start position, so as to include a ruled line; a character/ruled-line separation unit that scans the image data within the first and second ruled line detection areas in the image data storage unit to detect the cumulative number of black pixels in the predetermined row direction for each scanning line, performs ruled line detection within the first and second ruled line detection areas for each character based on changes in the row-direction cumulative black pixel count, and, based on the result, sets for each character the character start and end positions in the predetermined column direction so that they lie on the character portion area side of the ruled lines; and a control unit that controls the operation of the reading unit, the ruled line detection area setting unit, and the character/ruled-line separation unit, and outputs the character start and end positions in the predetermined row direction and in the predetermined column direction for each individual character in the ruled character line area.

(Operation) In a character extraction device configured in this way, the reading unit optically reads an information medium bearing a character line with ruled lines and outputs image data of the information medium, and the image data storage unit stores the image data from the reading unit.

The ruled line detection area setting unit sets the character portion area within the ruled character line area so as not to include the ruled lines, and scans the image data within the character portion area in the image data storage unit to detect the cumulative number of black pixels in the predetermined column direction for each scanning line. Based on changes in the detected column-direction cumulative black pixel count, it then detects the character start and end positions in the predetermined row direction for each character.

Based on the row-direction character start and end positions, the ruled line detection area setting unit further sets, for each character, the first ruled line detection area at a position adjacent to the character start position in the predetermined row direction, on the side opposite to the character end position, so as to include a ruled line, and the second ruled line detection area at a position adjacent to the character end position in the predetermined row direction, on the side opposite to the character start position, so as to include a ruled line.

The character/ruled-line separation unit scans the image data within the first and second ruled line detection areas in the image data storage unit to detect the cumulative number of black pixels in the predetermined row direction for each scanning line, and performs ruled line detection within the first and second ruled line detection areas based on changes in this row-direction cumulative black pixel count. Based on the ruled line detection result, it then sets, for each character, the character start and end positions in the predetermined column direction so that they lie on the character portion area side of the ruled lines.

The control unit controls the operation of the reading unit, the ruled line detection area setting unit, and the character/ruled-line separation unit, and outputs the character start and end positions in the predetermined row direction and the start and end positions in the predetermined column direction for each individual character in the ruled character line area.

As described above, detecting the character start and end positions in the predetermined row direction from changes in the column-direction cumulative black pixel count within the character portion area simplifies the detection of these positions. Moreover, by setting the character portion area so that it excludes the ruled lines and contains only the character part of a single character line, the black pixels making up the ruled lines are not counted, or are counted only slightly, in the column-direction cumulative black pixel count; as a result, the accuracy of detecting these start and end positions is improved.

Likewise, detecting the character start and end positions in the predetermined column direction from changes in the row-direction cumulative black pixel count within the first and second ruled line detection areas simplifies the detection of these positions.

In addition, because the first ruled line detection area is set at a position adjacent to the character start position in the predetermined row direction, on the side opposite to the character end position, so as to include a ruled line, the black pixels making up the character are not counted, or are hardly counted, in the row-direction cumulative black pixel count of the first ruled line detection area. As a result, the position of the ruled line within the first ruled line detection area can be detected accurately. Similarly, because the second ruled line detection area is set at a position adjacent to the character end position in the predetermined row direction, on the side opposite to the character start position, so as to include a ruled line, the black pixels making up the character are not counted, or are hardly counted, in the row-direction cumulative black pixel count of the second ruled line detection area. As a result, the position of the ruled line within the second ruled line detection area can be detected accurately.

Because the positions of the ruled lines within the first and second ruled line detection areas can thus be detected accurately, characters can be cut out with no chipping, or with less chipping than in the conventional art.

(Embodiments) Embodiments of this invention are described below with reference to the drawings. The drawings are only schematic, to the extent needed to understand the invention; the configuration of each component, the input and output signals and their flow, the signal line connections, and the flow of operation are therefore not limited to the illustrated examples.

Device configuration: FIG. 1 is a functional block diagram schematically showing the configuration of an embodiment of this invention.

The character extraction device of this embodiment comprises a reading unit 10, an image data storage unit 12, a ruled line detection area setting unit 14, a character/ruled-line separation unit 16, and a control unit 18.

The reading unit 10 optically reads an information medium bearing a character line with ruled lines and outputs image data of the information medium, and the image data storage unit 12 stores the image data from the reading unit 10.

The ruled line detection area setting unit 14 sets the character portion area within the ruled character line area so as not to include the ruled lines, scans the image data within the character portion area of the image data storage unit 12 to detect the cumulative number of black pixels in the predetermined column direction for each scanning line, and detects the character start and end positions in the predetermined row direction for each character based on changes in this column-direction cumulative black pixel count.

The ruled line detection area setting unit 14 then sets the first and second ruled line detection areas based on the character start and end positions in the predetermined row direction. For each character, it sets the first ruled line detection area at a position adjacent to the character start position in the predetermined row direction, on the side opposite to the character end position, so as to include a ruled line, and sets the second ruled line detection area at a position adjacent to the character end position in the predetermined row direction, on the side opposite to the character start position, so as to include a ruled line.

The character/ruled-line separation unit 16 scans the image data within the first and second ruled line detection areas of the image data storage unit 12 to detect the cumulative number of black pixels in the predetermined row direction for each scanning line, and performs ruled line detection within the first and second ruled line detection areas based on changes in this row-direction cumulative black pixel count. Based on the ruled line detection result, it then sets, for each character, the character start and end positions in the predetermined column direction so that they lie on the character portion area side of the ruled lines.

The control unit 18 controls the operation of the reading unit 10, the ruled line detection area setting unit 14, and the character/ruled-line separation unit 16, and outputs the character start and end positions in the predetermined row direction and in the predetermined column direction for each individual character in the ruled character line area.

This embodiment is described in more detail below. In the following description, the character start and end positions in the predetermined row direction are called the row-direction start position and the row-direction end position, and the character start and end positions in the predetermined column direction are called the column-direction start position and the column-direction end position.

(Reading unit) Although not illustrated, the reading unit 10 of this embodiment comprises a photoelectric conversion unit that converts reflected light P from a recording medium such as an original into an electrical signal quantized to black-and-white binary values (image data) and outputs this image data pixel by pixel, and a scanning mechanism that moves the photoelectric conversion unit and the recording medium relative to each other in order to scan the recording medium.

(Image data storage unit) The image data storage unit 12 of this embodiment is built from an image memory that stores the image data from the reading unit 10 in scanning order, and holds at least one character line of image data.

An X-Y coordinate system is set on the image memory; assuming a horizontally written document, for example, the X-axis direction is taken as the predetermined row direction and the Y-axis direction as the predetermined column direction.

The true row direction of the image may coincide with the X-axis direction or may deviate from it slightly.

To cut out characters without chipping, it is most preferable that the true row direction of the image coincide with the X-axis direction. Even if the true row direction deviates slightly from the X-axis direction, however, characters can be cut out with less chipping than in the conventional art, or with no chipping at all.

(Ruled line detection area setting unit) The ruled line detection area setting unit 14 of this embodiment comprises a first marginal distribution creation unit 20 that sets the character portion area and detects the cumulative number of black pixels in the predetermined column direction within that area (the column cumulative pixel count); a first marginal distribution storage unit 22 that stores the detected column cumulative pixel counts; a black block detection unit 24 that detects the row-direction start position and the row-direction end position; and a detection area position determination unit 26 that sets the first and second ruled line detection areas.

The first marginal distribution creation unit 20 receives from the control unit 18 one edge position Y_T1 and the other edge position Y_B1, in the predetermined column direction, of the ruled character line area, which contains one character line and the ruled lines attached to it. It then sets one edge position Y_T2 and the other edge position Y_B2 of the character portion area in the predetermined column direction, for example according to equations (1) and (2) below.

Y_T2 = Y_T1 + α ... (1)
Y_B2 = Y_B1 − α ... (2)

Here α is a constant set to any suitable value, for example α = 16, chosen on the assumption that the spacing between characters and ruled lines and the inclination of the true row direction relative to the predetermined row direction vary within predetermined ranges. Alternatively, α may be a value proportional to the distance between positions Y_T1 and Y_B1, for example α = (Y_T1 − Y_B1 + 1)/8.
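Equations (1) and (2) can be illustrated with a minimal sketch. It assumes that Y increases from the Y_T1 edge toward the Y_B1 edge and uses the constant example α = 16; the function name and the error raised for a line area that is too narrow are illustrative assumptions, not part of the patent text.

```python
def character_region_bounds(y_t1, y_b1, alpha=16):
    """Shrink the ruled character line area by alpha at both edges.

    Implements Y_T2 = Y_T1 + alpha and Y_B2 = Y_B1 - alpha, assuming the
    Y coordinate grows from the T edge toward the B edge.
    """
    y_t2 = y_t1 + alpha
    y_b2 = y_b1 - alpha
    if y_t2 > y_b2:
        # Assumption for this sketch: reject areas too narrow to shrink.
        raise ValueError("line area is too narrow for the chosen alpha")
    return y_t2, y_b2


# Hypothetical line area spanning rows 100 to 200.
y_t2, y_b2 = character_region_bounds(100, 200)
```

Shrinking the area by a margin on both edges is what keeps the ruled lines out of the column-direction count even when the line is slightly inclined.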

The first marginal distribution creation unit 20 then scans the image data within the character portion area of the image data storage unit 12 based on positions Y_T2 and Y_B2, accumulates, for each position X in the predetermined row direction, the black pixels on the scanning line in the predetermined column direction to obtain the column cumulative pixel count, and stores the detected counts in the first marginal distribution storage unit 22.
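The column cumulative pixel count can be sketched as follows, assuming a binary image stored as a list of rows (1 for a black pixel) and inclusive bounds Y_T2 and Y_B2. The function name and the sample image are hypothetical.

```python
def column_profile(image, y_t2, y_b2):
    """For each X, count black pixels in rows y_t2..y_b2 (inclusive)."""
    width = len(image[0])
    return [
        sum(image[y][x] for y in range(y_t2, y_b2 + 1))
        for x in range(width)
    ]


# Hypothetical 4x4 image: ruled lines in rows 0 and 3, a character between.
image = [
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]
profile = column_profile(image, 1, 2)
```

Because rows 0 and 3 lie outside the character portion area, the ruled lines contribute nothing to the counts.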

The black block detection unit 24 reads the column cumulative pixel counts detected for each position X in order from the smallest X (or from the largest) and compares each count with a predetermined threshold. This comparison tracks the change in the column cumulative pixel count: when the count changes from below the threshold to at or above it, the position X whose count is at or above the threshold is taken as the row-direction start position X_L (or the row-direction end position X_R); when the count changes from at or above the threshold to below it, the position X whose count is at or above the threshold is detected as the row-direction end position X_R (or the row-direction start position X_L). One character can be regarded as lying between the positions X_L and X_R detected in this way. Hereinafter, the region between positions X_L and X_R within the ruled character line area is called a single-character area.
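The threshold comparison performed by the black block detection unit can be sketched as a single pass over the column cumulative pixel counts, read from the smallest X. The function name, the default threshold of 1, and the handling of a block that runs to the last column are assumptions made for this illustration.

```python
def detect_black_blocks(profile, threshold=1):
    """Return (x_l, x_r) pairs for runs where the count is >= threshold."""
    blocks = []
    start = None
    for x, count in enumerate(profile):
        if count >= threshold and start is None:
            # Below-threshold to at-or-above: this X is the start X_L.
            start = x
        elif count < threshold and start is not None:
            # At-or-above to below: the previous X is the end X_R.
            blocks.append((start, x - 1))
            start = None
    if start is not None:
        # Assumption: a run reaching the last column ends at that column.
        blocks.append((start, len(profile) - 1))
    return blocks


blocks = detect_black_blocks([0, 2, 3, 0, 0, 1, 4, 1, 0])
```

Each returned pair corresponds to one single-character area; in both pairs the recorded positions are columns whose count is at or above the threshold, as in the description.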

The black block detection unit 24 detects positions X_L and X_R for each individual character in the line area.

The detection area position determination unit 26 sets the first ruled line detection area adjacent to the row-direction start position X_L, on the side opposite to the row-direction end position X_R, so as to include a ruled line; the first ruled line detection area therefore lies outside the single-character area, adjoining it on the X_L side.

To improve the accuracy of detecting the ruled line within the first ruled line detection area, the width of the area in the predetermined row direction should be no greater than the distance between adjacent characters, which reduces the number of character black pixels included in the first ruled line detection area.

Similarly, the detection area position determination unit 26 sets the second ruled line detection area adjacent to the row-direction end position X_R, on the side opposite to the row-direction start position X_L, so as to include a ruled line; the second ruled line detection area therefore lies outside the single-character area, adjoining it on the X_R side. To improve the accuracy of detecting the ruled line within the second ruled line detection area, the width of the area in the predetermined row direction should be no greater than the distance between adjacent characters, which reduces the number of character black pixels included in the second ruled line detection area.

The detection area position determination unit 26 sets the first and second ruled-line detection areas for each individual character in the character line.

For example, the first ruled-line detection area is set according to equations (3) and (4) below, and the second ruled-line detection area according to equations (5) and (6).

Here, characters are counted in order of increasing coordinate X in the predetermined row direction, and the n-th character counted in this way is simply called the n-th character. For the n-th character, X_L(n) and X_R(n) denote the positions X_L and X_R; X_LL(n) and X_LR(n) denote the two edge positions, in the predetermined row direction, of its first ruled-line detection area; and X_RL(n) and X_RR(n) denote the two edge positions, in the predetermined row direction, of its second ruled-line detection area. n_LS is the total number of characters detected in one character line, n is a natural number in the range 1 ≤ n ≤ n_LS, and β is a constant that may be set to any suitable value. When the optical reading resolution of the reading unit 10 is 16 pixels/mm, for example, β = 32 may be used.

$$X_{LR}(n) = X_L(n) - 1 \qquad (3)$$

$$X_{LL}(n) = \begin{cases} X_L(n) - 1 - \beta & \text{if } n = 1 \text{ or } X_L(n) - 1 - \beta > X_R(n-1) + 1 \\ X_R(n-1) + 1 & \text{if } n \ge 2 \text{ and } X_L(n) - 1 - \beta \le X_R(n-1) + 1 \end{cases} \qquad (4)$$

$$X_{RL}(n) = X_R(n) + 1 \qquad (5)$$

$$X_{RR}(n) = \begin{cases} X_R(n) + 1 + \beta & \text{if } n = n_{LS} \text{ or } X_R(n) + 1 + \beta < X_L(n+1) - 1 \\ X_L(n+1) - 1 & \text{if } n \le n_{LS} - 1 \text{ and } X_R(n) + 1 + \beta \ge X_L(n+1) - 1 \end{cases} \qquad (6)$$

(Character/ruled-line separation unit)
The character/ruled-line separation unit 16 comprises: a second peripheral distribution creation unit 28, which detects the cumulative number of black pixels along the predetermined row direction within the first ruled-line detection area (hereinafter, the first row cumulative pixel count) and within the second ruled-line detection area (hereinafter, the second row cumulative pixel count); a second peripheral distribution storage unit 30, which stores the detected first and second row cumulative pixel counts; a ruled-line detection unit 32, which performs ruled-line detection in the first ruled-line detection area based on changes in the first row cumulative pixel count and in the second ruled-line detection area based on changes in the second row cumulative pixel count; and a ruled-line separation position determination unit 34, which, based on the detection results of the ruled-line detection unit 32, sets the column-direction start position and the column-direction end position on the character area side of the ruled lines.

In this embodiment, the second peripheral distribution creation unit 28 scans the image data of the first ruled-line detection area held in the image data storage unit 12. For each position Y in the predetermined column direction, it accumulates the black pixels on the scanning line running in the predetermined row direction to obtain the first row cumulative pixel count, and stores that count in the second peripheral distribution storage unit 30. In the same way, it detects the second row cumulative pixel count and stores it in the second peripheral distribution storage unit 30.

The ruled-line detection unit 32 takes Y_T1 as the edge position on one side of the first ruled-line detection area in the predetermined column direction. Starting from a position Y_LS1 in the character-line margin within the first ruled-line detection area, it decreases Y step by step toward the edge position Y_T1, reads out the first row cumulative pixel count at each position, and compares it with a predetermined threshold. This comparison reveals changes in the row cumulative pixel count. Here Y_LS1 > Y_T1, and Y_LS1 is chosen so that the ruled line and the margin lie between Y_LS1 and Y_T1; for example, Y_LS1 = Y_T2 may be used.

When the first row cumulative pixel count changes from below the threshold to at or above the threshold, the position Y of the count that is at or above the threshold is detected as the first ruled-line detection result Y_LT. Y_LT is the position Y at which the first row cumulative pixel count first changes in this way.

If all first row cumulative pixel counts from Y_LS1 to Y_T1 have been read without detecting such a change, the first ruled-line detection result Y_LT is set to the edge position Y_T1.

The ruled-line detection unit 32 further takes Y_B1 as the edge position on the other side of the first ruled-line detection area in the predetermined column direction. Starting from a position Y_LS2 in the character-line margin within the first ruled-line detection area, it increases Y step by step toward the edge position Y_B1, reads out the first row cumulative pixel count at each position, and examines how the count changes. Here Y_LS2 < Y_B1, and Y_LS2 is chosen so that the ruled line and the margin lie between Y_LS2 and Y_B1; for example, Y_LS2 = Y_B2 may be used.

When the first row cumulative pixel count changes from below the threshold to at or above the threshold, the position Y of the count that is at or above the threshold is detected as the second ruled-line detection result Y_LB. Y_LB is the position Y at which the first row cumulative pixel count first changes in this way.

If all first row cumulative pixel counts from Y_LS2 to Y_B1 have been read without detecting such a change, the second ruled-line detection result Y_LB is set to the edge position Y_B1.
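The scan described above, from a margin position toward an edge of the detection area, can be sketched as follows. This is only an illustration under stated assumptions: the function name `find_rule_edge` and the list-based profile `row_counts` (indexed by Y) are hypothetical and do not appear in the patent.

```python
def find_rule_edge(row_counts, y_start, y_edge, threshold):
    """Scan a row cumulative pixel profile from y_start toward y_edge.

    Returns the first Y at which the count changes from below the
    threshold to at or above it. If no such change is found, returns
    y_edge, which the text treats as "no ruled line detected".
    row_counts is assumed to be indexable by Y (hypothetical layout).
    """
    step = 1 if y_edge >= y_start else -1
    was_below = False  # no transition can be reported at the first sample
    for y in range(y_start, y_edge + step, step):
        at_or_above = row_counts[y] >= threshold
        if at_or_above and was_below:
            return y
        was_below = not at_or_above
    return y_edge


# Example profile: ruled lines at Y = 1 and Y = 8, margin elsewhere.
counts = [0, 5, 0, 0, 0, 0, 0, 0, 5, 0]
y_lt = find_rule_edge(counts, 3, 0, threshold=3)  # scan upward from Y_LS1 = 3
y_lb = find_rule_edge(counts, 6, 9, threshold=3)  # scan downward from Y_LS2 = 6
```

The same function would serve for the second ruled-line detection area by passing it the second row cumulative pixel profile.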

The ruled-line detection unit 32 also takes Y_T1 as the edge position on one side of the second ruled-line detection area in the predetermined column direction. Starting from a position Y_RS1 in the character-line margin within the second ruled-line detection area, it decreases Y step by step toward the edge position Y_T1, reads out the second row cumulative pixel count at each position, and examines how the count changes. Here Y_RS1 > Y_T1, and Y_RS1 is chosen so that the ruled line and the margin lie between Y_RS1 and Y_T1; for example, Y_RS1 = Y_T2 may be used. A third detection result Y_RT, corresponding to the outcome of this examination, is obtained in the same way as for the first ruled-line detection area.

The ruled-line detection unit 32 further takes Y_B1 as the edge position on the other side of the second ruled-line detection area. Starting from a position Y_RS2 in the character-line margin within the second ruled-line detection area, it increases Y step by step toward the edge position Y_B1, reads out the second row cumulative pixel count at each position, and examines how the count changes. Here Y_RS2 < Y_B1, and Y_RS2 is chosen so that the ruled line and the margin lie between Y_RS2 and Y_B1; for example, Y_RS2 = Y_B2 may be used. A fourth detection result Y_RB, corresponding to the outcome of this examination, is obtained in the same way as for the first ruled-line detection area.

A change in the first or second row cumulative pixel count can be taken to mean that a ruled line has been detected, and the detection result Y_LT, Y_LB, Y_RT or Y_RB obtained when the count changes can be taken to indicate the edge position of the ruled line on the character area side. Conversely, when Y_LT = Y_T1, Y_LB = Y_B1, Y_RT = Y_T1 or Y_RB = Y_B1, it can be taken that no ruled line was detected in, respectively, the part of the first ruled-line detection area in the range Y_T1 ≤ Y ≤ Y_LS1, the part of the first ruled-line detection area in the range Y_LS2 ≤ Y ≤ Y_B1, the part of the second ruled-line detection area in the range Y_T1 ≤ Y ≤ Y_RS1, or the part of the second ruled-line detection area in the range Y_RS2 ≤ Y ≤ Y_B1.

The ruled-line separation position determination unit 34 sets the column-direction start position Y_T and the column-direction end position Y_B on the character area side of the ruled-line positions, for example according to decision expressions (7) and (8) below.

$$Y_T = \begin{cases} Y_{LT} & \text{if } Y_{LT} \ge Y_{RT} \\ Y_{RT} & \text{if } Y_{LT} < Y_{RT} \end{cases} \qquad (7)$$

$$Y_B = \begin{cases} Y_{RB} & \text{if } Y_{LB} \ge Y_{RB} \\ Y_{LB} & \text{if } Y_{LB} < Y_{RB} \end{cases} \qquad (8)$$

(Control unit)
According to the format of the recording medium set in the reading unit 10, the control unit 18 reads from a format information storage unit (not shown) the edge positions Y_T1 and Y_B1 of the ruled character-line area in the predetermined column direction and its edge positions X_S and X_E in the predetermined row direction. It then outputs the positions Y_T1, Y_B1, X_S and X_E to the ruled-line detection area setting unit 14. The ruled character-line area is the area lying between Y_T1 and Y_B1 and between X_S and X_E.
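Returning to decision expressions (7) and (8): they select, on each side, whichever of the two detection results lies closer to the character, which amounts to a maximum for Y_T and a minimum for Y_B. A minimal sketch, with a hypothetical function name:

```python
def separation_positions(y_lt, y_rt, y_lb, y_rb):
    """Apply decision expressions (7) and (8).

    (7): Y_T is the larger of Y_LT and Y_RT.
    (8): Y_B is the smaller of Y_LB and Y_RB.
    The function name is illustrative, not from the patent.
    """
    y_t = y_lt if y_lt >= y_rt else y_rt
    y_b = y_rb if y_lb >= y_rb else y_lb
    return y_t, y_b


# Left area found the upper rule at Y = 1, right area at Y = 2;
# lower rule found at Y = 8 on the left and Y = 7 on the right.
y_t, y_b = separation_positions(1, 2, 8, 7)
```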

The positions Y_T1, Y_B1, X_S and X_E need not be those stored in advance in the format information storage unit. The positions Y_T1, Y_B1, X_S and X_E of a character line detected by scanning part or all of the image data of the recording medium with a conventionally known line detection means may be used instead.

The control unit 18 also receives the character positions X_L and X_R detected by the ruled-line detection area setting unit 14, together with the ruled-line detection area positions X_LL and X_LR and the positions X_RL and X_RR, and stores these positions separately for each character in the ruled character-line area.

The control unit 18 then outputs the ruled-line detection area positions X_LL, X_LR and X_RL, X_RR to the character/ruled-line separation unit 16, one character at a time.

When the character/ruled-line separation unit 16 has finished detecting the positions Y_T and Y_B for one character, the control unit 18 receives Y_T and Y_B from the separation unit 16 and stores them, in association with the positions X_L and X_R of the character for which they were obtained, as character extraction information. At the same time, on receiving Y_T and Y_B, the control unit 18 outputs to the character/ruled-line separation unit 16 the ruled-line detection area positions X_LL, X_LR and X_RL, X_RR for the next character.

The control unit 18 may output Y_T, Y_B, X_L and X_R of a character each time the character/ruled-line separation unit 16 finishes detecting Y_T and Y_B for that character. Alternatively, it may wait until the separation unit 16 has finished detecting Y_T and Y_B for all characters in the character-line area, or for any suitable number of them, and then output Y_T, Y_B, X_L and X_R for each detected character.

Description of operation
To aid understanding of the invention, the operation of this embodiment is now described using a more specific example. The description below covers character extraction for a single character line, but the invention clearly also applies to character extraction from multiple character lines.

When a processing start signal is input to the control unit 18 via an input unit (not shown), the control unit 18 outputs a reading start signal to the reading unit 10. It also reads the edge positions Y_T1, Y_B1, X_S and X_E of the ruled character-line area from the format information storage unit and outputs them to the first peripheral distribution creation unit 20. In preparation for the start of operation of the ruled-line detection area setting unit 14, the first peripheral distribution creation unit 20 sets the edge positions Y_T2 and Y_B2 of the character area in the predetermined column direction from the input edge positions Y_T1 and Y_B1 and equations (1) and (2). It also sets the edge positions of the character area in the predetermined row direction to, for example, X_S on one side and X_E on the other.

On receiving the reading start signal, the reading unit 10 starts optically reading the predetermined reading range of the recording medium and outputs image data pixel by pixel. In the image data, white pixels forming the background of the recording medium are output as, for example, "0", and black pixels forming characters as, for example, "1". The image data storage unit 12 stores the image data from the reading unit 10 in scanning order.

When the entire predetermined reading range has been read, and the image data for the entire reading range has therefore been stored, the ruled-line detection area setting unit 14 starts operating. The operation of the ruled-line detection area setting unit 14 is described below with reference to FIGS. 2 and 3.

FIGS. 2(A) and 2(B) show an example of image data and of the distribution of column cumulative pixel counts. FIG. 2(A) shows the image data of one character line contained in the predetermined reading range 36 of the recording medium; the X and Y axes in the figure are the coordinate axes of the X-Y coordinate system set on the image memory. FIG. 2(B) shows the distribution of column cumulative pixel counts detected for the character line of FIG. 2(A); the horizontal axis is the position X in the predetermined row direction and the vertical axis is the column cumulative black pixel count at position X. FIG. 3 shows an example of the operation flow of the ruled-line detection area setting unit. In FIG. 2(A), the area enclosed by a one-dot chain line and labeled 40 is an example of a ruled character-line area.

When the ruled-line detection area setting unit 14 starts operating, the first peripheral distribution creation unit 20 begins detecting the column cumulative pixel counts of the character area 38 (the area enclosed by a two-dot chain line in FIG. 2(A)).

The first peripheral distribution creation unit 20 sets the main scanning position X within the character area to X_S (S1). At position X, it scans the image data in the predetermined column direction and counts the black pixels between Y_T2 and Y_B2, thereby detecting the column cumulative pixel count HL(X) at position X (S2). It stores the detected count in the first peripheral distribution storage unit 22 as HL(X) for that position X.

The column cumulative pixel count HL(X) can be expressed by equation (9) below.

Here, P(X, Y) denotes the image data: P(X, Y) = 1 represents a black pixel and P(X, Y) = 0 a white pixel.
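Equation (9) itself is not reproduced in this text. Judging from the description of step S2 and the definition of P(X, Y), it presumably has the following form; this is a reconstruction from the surrounding description, not a copy of the original formula:

$$HL(X) = \sum_{Y = Y_{T2}}^{Y_{B2}} P(X, Y) \qquad (9)$$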

Next, the first peripheral distribution creation unit 20 determines whether the column cumulative pixel count HL(X) has been detected at every position X in the character area 38 (S3). If X ≠ X_E, it adds 1 to X in order to detect the column cumulative pixel count at the next position (S4) and returns to S2. If X = X_E, it regards HL(X) as detected at every position X and ends detection of the column cumulative pixel counts.
When detection of the column cumulative pixel counts is complete, the black block detection unit 24 starts detecting the row-direction start positions X_L and end positions X_R. It initializes the number of detected characters n to 1 and the read position X of HL(X) to X_S, and notionally sets the column cumulative pixel count at the position X_S − 1 immediately before X_S to HL(X_S − 1) = 0 (S5). It then reads HL(X) for position X from the first peripheral distribution storage unit 22 and compares it with a predetermined threshold THL (for example, THL = 1) to determine whether that HL(X) is a black block element (S6). An HL(X) satisfying HL(X) ≥ THL is regarded as a black block element forming part of a character, and an HL(X) satisfying HL(X) < THL as a white block element forming the background of the recording medium. Setting THL to a suitable value reduces the chance that an HL(X) made up of accumulated noise in the image data is detected as a black block element.

If a black block element is detected in S6, the black block detection unit 24 reads the column cumulative pixel count HL(X − 1) of the immediately preceding position X − 1 from the first peripheral distribution storage unit 22 and compares it with the threshold THL to determine whether HL(X − 1) is a white block element (S7). If the result in S7 is a white block element, a white block element has been found at position X − 1 and a black block element at position X, so the position X is stored as the start position X_L(n) of the n-th character in the predetermined row direction (S8), and processing proceeds to S9. If the result in S7 is a black block element, processing proceeds to S9 without performing S8.

If a white block element is detected in S6, the black block detection unit 24 reads the column cumulative pixel count HL(X − 1) of the immediately preceding position X − 1 from the first peripheral distribution storage unit 22 and compares it with the threshold THL to determine whether HL(X − 1) is a black block element (S10). If the result in S10 is a black block element, a black block element has been found at position X − 1 and a white block element at position X, so the position X is stored as the end position X_R(n) of the n-th character in the predetermined row direction (S11). The unit then adds 1 to n in order to detect the positions X_L(n) and X_R(n) of the next character (S12) and proceeds to S9. If the result in S10 is a white block element, processing proceeds to S9 without performing S11 and S12.

In S9, the black block detection unit 24 determines whether all column cumulative pixel counts HL(X) in the character area 38 have been read. If X ≠ X_E in S9, not all of them have been read, so the unit adds 1 to X in order to read HL(X) at the next position (S13) and returns to S6. If X = X_E in S9, all column cumulative pixel counts HL(X) have been read, so the black block detection unit 24 ends detection of the row-direction start positions X_L and end positions X_R for all characters in the character area 38.
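The loop of steps S5 to S13 can be sketched as below. The sketch follows the flow as described, so X_R(n) is the first white column after a character rather than its last black column. The function name, the dictionary `hl`, and the returned lists are illustrative assumptions; the sketch also assumes THL ≥ 1, so that the notional HL(X_S − 1) = 0 counts as white.

```python
def detect_black_blocks(hl, x_s, x_e, thl=1):
    """Return (starts, ends): the lists of X_L(n) and X_R(n).

    hl maps each X in [x_s, x_e] to the column cumulative pixel count.
    A block still open at X_E receives no end position, exactly as in
    the flow described (no white column follows it inside the area).
    """
    starts, ends = [], []
    prev_black = False  # S5: HL(X_S - 1) is taken to be 0, i.e. white
    for x in range(x_s, x_e + 1):
        black = hl[x] >= thl          # S6
        if black and not prev_black:  # S7: white -> black transition
            starts.append(x)          # S8: X_L(n)
        elif not black and prev_black:  # S10: black -> white transition
            ends.append(x)            # S11: X_R(n), then n += 1 (S12)
        prev_black = black
    return starts, ends


profile = {0: 0, 1: 2, 2: 3, 3: 0, 4: 0, 5: 1, 6: 0}
x_l, x_r = detect_black_blocks(profile, 0, 6)
```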

When detection of the positions X_L and X_R is complete, the detection area position determination unit 26 starts setting the first and second ruled-line detection areas. Counting characters in the character area 38 in the direction of increasing X, it reads the row-direction start position X_L(n) and end position X_R(n) of the n-th character from the black block detection unit 24. From these positions and equations (3) to (6), it sets the edge positions X_LL(n) and X_LR(n) of the first ruled-line detection area of the n-th character and the edge positions X_RL(n) and X_RR(n) of its second ruled-line detection area, thereby setting the ruled-line detection areas (S14). It outputs the positions X_L(n), X_R(n), X_LL(n), X_LR(n), X_RL(n) and X_RR(n) to the control unit 18. The control unit 18 receives the row-direction start and end positions and the edge positions of the first and second ruled-line detection areas for the first through n_LS-th characters and stores them separately for each character.
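Step S14 applies equations (3) to (6) to each character in turn. The sketch below is one way to write that, using 0-based Python lists for the 1-based character index n; the function name and the tuple layout of the result are assumptions made for illustration.

```python
def detection_areas(x_l, x_r, beta=32):
    """Return [(X_LL, X_LR, X_RL, X_RR), ...], one tuple per character.

    x_l[i] and x_r[i] hold X_L(n) and X_R(n) for n = i + 1.
    Each area is at most beta wide and is clipped so that it does not
    reach into the neighbouring character.
    """
    n_ls = len(x_l)
    areas = []
    for i in range(n_ls):
        x_lr = x_l[i] - 1                      # equation (3)
        x_ll = x_l[i] - 1 - beta               # equation (4), first case
        if i >= 1 and x_ll <= x_r[i - 1] + 1:
            x_ll = x_r[i - 1] + 1              # equation (4), second case
        x_rl = x_r[i] + 1                      # equation (5)
        x_rr = x_r[i] + 1 + beta               # equation (6), first case
        if i <= n_ls - 2 and x_rr >= x_l[i + 1] - 1:
            x_rr = x_l[i + 1] - 1              # equation (6), second case
        areas.append((x_ll, x_lr, x_rl, x_rr))
    return areas


# Two characters only 20 pixels apart, so the facing areas are clipped.
areas = detection_areas([100, 150], [130, 180], beta=32)
```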

When the detection area position determination unit 26 has output the row-direction start and end positions of the n_LS-th character and the edge positions of its first and second ruled-line detection areas, the ruled-line detection area setting unit 14 ends its operation.

When the ruled-line detection area setting unit 14 has finished operating, the control unit 18 outputs the row-direction start and end positions and the edge positions of the first and second ruled-line detection areas to the character/ruled-line separation unit 16, one character at a time from the first character to the n_LS-th. The character/ruled-line separation unit 16 receives these positions for one character, uses them to detect the column-direction start and end positions of that character, and outputs the result to the control unit 18. On receiving the column-direction start and end positions for one character from the character/ruled-line separation unit 16, the control unit 18 outputs them, together with the row-direction start and end positions of the same character, as character extraction information. At the same time it outputs the row-direction start and end positions and the edge positions of the first and second ruled-line detection areas for the next character to the character/ruled-line separation unit 16.

When the control unit 18 receives the column-direction start and end positions of the n_LS-th character from the character/ruled-line separation unit 16, it outputs the column-direction and row-direction start and end positions of that character and then ends character extraction for the character line.

An example of the operation of the character/ruled-line separation unit 16 is described below with reference to FIGS. 4 and 5.

FIG. 4(A) mainly shows the image data of one character, and FIGS. 4(B) and 4(C) show the distributions of the first row cumulative pixel count and the second row cumulative pixel count for the image data of FIG. 4(A). FIGS. 5(A) and 5(B) show an example of the operation flow of the character/ruled-line separation unit.

When the character/ruled-line separation unit 16 receives from the control unit 18 the row-direction start and end positions X_L(n) and X_R(n), the edge positions X_LL(n) and X_LR(n) of the first ruled-line detection area, and the edge positions X_RL(n) and X_RR(n) of the second ruled-line detection area, it starts detecting the column-direction start and end positions Y_T and Y_B of the n-th character. First, the second peripheral distribution creation unit 28 starts detecting the row cumulative pixel count BLH(Y) in the first ruled-line detection area 42 of the n-th character and the row cumulative pixel count BRH(Y) in its second ruled-line detection area 44. In FIG. 4, the first ruled-line detection area 42 is drawn with a three-dot chain line and the second ruled-line detection area 44 with a four-dot chain line.

Having started detection of the row cumulative pixel counts, the second peripheral distribution creating unit 28 sets the scanning position Y in the predetermined column direction to Y_T1 (*S1). At position Y it then scans the image data in the predetermined row direction, counts the black pixels between X_LL(n) and X_LR(n) to obtain the first row cumulative pixel count BLH(Y), and stores BLH(Y) in the second peripheral distribution storage unit 30 as the first row cumulative pixel count at that position. At the same position Y it also counts the black pixels between X_RL(n) and X_RR(n) to obtain the second row cumulative pixel count BRH(Y), and stores BRH(Y) in the second peripheral distribution storage unit 30 as the second row cumulative pixel count at that position (*S2).

The row cumulative pixel counts BLH(Y) and BRH(Y) can be expressed by equations (10) and (11).
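Equations (10) and (11) themselves do not survive in this text. Judging from the description of step *S2, they presumably take a form like the following. The pixel-value function f(X, Y), equal to 1 for a black pixel and 0 otherwise, is notation assumed here for illustration and may differ from the symbol used in the original equations.

```latex
\begin{align}
BLH(Y) &= \sum_{X = X_{LL}(n)}^{X_{LR}(n)} f(X, Y), \qquad Y_{T1} \le Y \le Y_{B1} \tag{10} \\
BRH(Y) &= \sum_{X = X_{RL}(n)}^{X_{RR}(n)} f(X, Y), \qquad Y_{T1} \le Y \le Y_{B1} \tag{11}
\end{align}
```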

Next, the second peripheral distribution creating unit 28 determines whether the row cumulative pixel counts have been detected at every position Y in the ruled line detection areas 42 and 44 (*S3). If Y is not equal to Y_B1, 1 is added to Y so that the row cumulative pixel counts at the next position are detected (*S4), and processing returns to *S2. If Y = Y_B1, the row cumulative pixel counts BLH(Y) and BRH(Y) are regarded as having been detected at every position Y, and detection of the row cumulative pixel counts ends.
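Steps *S1 to *S4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `row_counts`, the representation of the image as a list of rows with 1 for a black pixel, and the inclusive column bounds are not taken from the patent.

```python
def row_counts(image, y_top, y_bottom, x_left, x_right):
    """Count black pixels on each row Y in [y_top, y_bottom], between
    columns x_left and x_right inclusive.

    Assumption: image[y][x] is 1 for a black pixel and 0 for white.
    """
    counts = {}
    y = y_top                                   # *S1: start at Y_T1
    while True:
        # *S2: scan the row and accumulate the black pixels in the area
        counts[y] = sum(image[y][x_left:x_right + 1])
        if y == y_bottom:                       # *S3: reached Y_B1?
            break
        y += 1                                  # *S4: move to the next row
    return counts


# Toy image: the left area is columns 0-1, the right area is columns 4-5.
image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 1],
]
blh = row_counts(image, 0, 3, 0, 1)   # first ruled line detection area
brh = row_counts(image, 0, 3, 4, 5)   # second ruled line detection area
```

Calling the same routine twice, once per detection area, mirrors the way BLH(Y) and BRH(Y) are both obtained at each position Y in *S2.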

When detection of the row cumulative pixel counts is complete, the ruled line detection unit 32 starts detecting the column-direction start position Y_T and the column-direction end position Y_B. It sets the read position Y of BLH(Y) to Y_T2, which serves as Y_LS1 (*S5). It then reads the row cumulative pixel count BLH(Y) at position Y from the second peripheral distribution storage unit 30 and compares it with a predetermined threshold THL_L to determine whether BLH(Y) is a black block element (*S6).

THL_L can be set to any suitable value. In this example, THL_L is set according to equation (12).

If the ruled line detection unit 32 detects a black block element in *S6, a ruled line has been detected, so it stores the detection position Y of that black block element as the ruled line detection result Y_LT (*S7) and proceeds to *S10. If no black block element is detected in *S6, it determines whether all BLH(Y) from Y_LS1 to Y_T1 have been read (*S8). If Y = Y_T1 in *S8, every BLH(Y) has been read without a ruled line being detected, so the position Y = Y_T1 is stored as the ruled line detection result Y_LT (*S7) and processing proceeds to *S10. If Y is not equal to Y_T1 in *S8, 1 is subtracted from Y so that BLH(Y) at the next position is read (*S9), and processing returns to *S6.

In *S10, the ruled line detection unit 32 sets the read position Y of BLH(Y) to Y_B2, which serves as Y_LS2. It then reads the row cumulative pixel count BLH(Y) at position Y from the second peripheral distribution storage unit 30 and compares it with the threshold THL_L to determine whether BLH(Y) is a black block element (*S11).

If the ruled line detection unit 32 detects a black block element in *S11, a ruled line has been detected, so it stores the detection position Y of that black block element as the ruled line detection result Y_LB (*S12) and proceeds to *S15. If no black block element is detected in *S11, it determines whether all BLH(Y) from Y_LS2 to Y_B1 have been read (*S13). If Y = Y_B1 in *S13, every BLH(Y) has been read without a ruled line being detected, so the position Y = Y_B1 is stored as the ruled line detection result Y_LB (*S12) and processing proceeds to *S15. If Y is not equal to Y_B1 in *S13, 1 is added to Y so that BLH(Y) at the next position is read (*S14), and processing returns to *S11.

In *S15, the ruled line detection unit 32 sets the read position Y of BRH(Y) to Y_T2, which serves as Y_RS1. It then reads the row cumulative pixel count BRH(Y) at position Y from the second peripheral distribution storage unit 30 and compares it with a threshold THL_R to determine whether BRH(Y) is a black block element (*S16).

THL_R can be set to any suitable value. In this example, THL_R is set according to equation (13).

If the ruled line detection unit 32 detects a black block element in *S16, it stores the detection position Y of that black block element as the ruled line detection result Y_RT (*S17) and proceeds to *S20. If no black block element is detected in *S16, it determines whether all BRH(Y) from Y_RS1 to Y_T1 have been read (*S18). If Y = Y_T1 in *S18, every BRH(Y) has been read without a ruled line being detected, so the position Y = Y_T1 is stored as the ruled line detection result Y_RT (*S17) and processing proceeds to *S20. If Y is not equal to Y_T1 in *S18, 1 is subtracted from Y so that BRH(Y) at the next position is read (*S19), and processing returns to *S16.

In *S20, the ruled line detection unit 32 sets the read position Y of BRH(Y) to Y_B2, which serves as Y_RS2. It then reads the row cumulative pixel count BRH(Y) at position Y from the second peripheral distribution storage unit 30 and compares it with the threshold THL_R to determine whether BRH(Y) is a black block element (*S21).

If the ruled line detection unit 32 detects a black block element in *S21, it stores the detection position Y of that black block element as the ruled line detection result Y_RB (*S22), outputs Y_LT, Y_LB, Y_RT, and Y_RB to the ruled line separation position determination unit 34 as the ruled line detection results for the n-th character, and ends the ruled line detection processing. If no black block element is detected in *S21, it determines whether all BRH(Y) from Y_RS2 to Y_B1 have been read (*S23). If Y = Y_B1 in *S23, every BRH(Y) has been read without a ruled line being detected, so the position Y = Y_B1 is stored as the ruled line detection result Y_RB (*S22), the results Y_LT, Y_LB, Y_RT, and Y_RB for the n-th character are output to the ruled line separation position determination unit 34, and the ruled line detection processing ends. If Y is not equal to Y_B1 in *S23, 1 is added to Y so that BRH(Y) at the next position is read (*S24), and processing returns to *S21.
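The four searches in *S5 to *S24 share one pattern: start at an inner position, step toward an outer limit, and stop at the first black block element or at the limit. A minimal Python sketch of that pattern follows. The names `find_rule` and `detect_rules` are hypothetical, and treating a count at or above the threshold as a black block element is an assumption, since the exact comparison rule is not spelled out in this passage.

```python
def find_rule(counts, start, limit, threshold):
    """Walk from `start` toward `limit` (inclusive) and return the first
    position whose count qualifies as a black block element.
    If none is found, return `limit`, as in *S8, *S13, *S18 and *S23.

    Assumption: "black block element" means counts[y] >= threshold.
    """
    step = 1 if limit >= start else -1
    y = start
    while True:
        if counts[y] >= threshold:   # *S6 / *S11 / *S16 / *S21
            return y
        if y == limit:               # all positions read, no ruled line
            return limit
        y += step                    # *S9 / *S14 / *S19 / *S24


def detect_rules(blh, brh, y_t1, y_t2, y_b2, y_b1, thl_l, thl_r):
    """Run the four searches; assumes y_t1 <= y_t2 and y_b2 <= y_b1."""
    y_lt = find_rule(blh, y_t2, y_t1, thl_l)   # upward, left area
    y_lb = find_rule(blh, y_b2, y_b1, thl_l)   # downward, left area
    y_rt = find_rule(brh, y_t2, y_t1, thl_r)   # upward, right area
    y_rb = find_rule(brh, y_b2, y_b1, thl_r)   # downward, right area
    return y_lt, y_lb, y_rt, y_rb


# Toy distributions indexed by Y = 0..8.
blh = [0, 5, 0, 0, 0, 0, 0, 4, 0]   # ruled lines at Y = 1 and Y = 7
brh = [0, 0, 0, 0, 0, 0, 0, 0, 6]   # ruled line only at Y = 8
result = detect_rules(blh, brh, y_t1=0, y_t2=3, y_b2=5, y_b1=8,
                      thl_l=3, thl_r=3)
```

In the toy data the upward search in the right-hand area finds no ruled line, so it falls back to the limit Y_T1 = 0, which corresponds to the branch in *S18 where Y = Y_T1 is stored as Y_RT.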

On receiving the ruled line detection results from the ruled line detection unit 32, the ruled line separation position determination unit 34 detects the column-direction start position Y_T and the column-direction end position Y_B of the n-th character according to equations (7) and (8) (*S25), and outputs the detected Y_T and Y_B to the control unit 18.

For example, from the image data shown in FIG. 4, Y_LT is detected as the column-direction start position Y_T of the character and Y_RB as the column-direction end position. Consequently, even when the true row direction of the image deviates from the predetermined row direction and the character 46 is in contact with the ruled line 48, the character 46 can be separated from the ruled line 48 and extracted.

The present invention is not limited to the embodiment described above. The operation flow, input and output signals, operation timing, connections, and configuration of each component may be changed in any suitable way.

For example, the embodiment above used image data of a horizontally written document to illustrate detection of the character start and end positions in the predetermined row direction and in the predetermined column direction, but the invention can clearly also be applied to vertically written documents.

FIG. 6 is a diagram showing an example of image data of a vertically written document. If, as shown in the figure, an X-Y coordinate system is set on the image memory with the X axis taken as the predetermined row direction, the character start and end positions in the predetermined row and column directions can be detected in the same way as in the embodiment described above.

(Effects of the Invention) As is clear from the above description, the character extraction device of the present invention detects the character start and end positions in the predetermined row direction from changes in the cumulative number of black pixels in the predetermined column direction within the character part area, which simplifies the detection of these start and end positions. Moreover, the character part area is set so that it contains only the character portion of one character line and no ruled lines. The black pixels that make up the ruled lines are therefore not counted, or are counted only to a negligible extent, in the column-direction cumulative black pixel count, which improves the accuracy with which these start and end positions are detected.

Furthermore, the character start and end positions in the predetermined column direction are detected from changes in the cumulative number of black pixels in the predetermined row direction within the first and second ruled line detection areas, that is, by counting cumulative black pixels and applying a threshold. This simplifies the detection of these start and end positions as well.

In addition, the first ruled line detection area is set so as to include a ruled line at a position adjacent to the character start position in the predetermined row direction, on the side opposite the character end position in that direction. The black pixels that make up the character are therefore not counted, or are hardly counted, in the row-direction cumulative black pixel count of the first ruled line detection area, so the position of the ruled line within that area can be detected accurately. Similarly, the second ruled line detection area is set so as to include a ruled line at a position adjacent to the character end position in the predetermined row direction, on the side opposite the character start position in that direction. The black pixels that make up the character are therefore not counted, or are hardly counted, in the row-direction cumulative black pixel count of the second ruled line detection area, so the position of the ruled line within that area can also be detected accurately.

Because the positions of the ruled lines in the first and second ruled line detection areas can be detected accurately, the character start and end positions in the predetermined column direction can be set so that no part of the character is lost, or so that less of the character is lost than with conventional devices, even if the predetermined row direction deviates somewhat from the true row direction of the image and/or the ruled line is in contact with the character.

The ruled lines in the first and second ruled line detection areas are also detected from changes in the cumulative number of black pixels, that is, by counting cumulative black pixels and applying a threshold, so ruled line detection in these areas requires only simple processing.

The extraction device of the present invention can therefore accurately detect the start and end positions of characters in the predetermined row and column directions with simple processing, even when the predetermined row direction deviates from the true row direction. Because the processing is simple, the device configuration can be made smaller and simpler, so a low-cost character extraction device can be obtained.

When the present invention is applied to a character recognition device, characters can be extracted accurately, and therefore recognized accurately, even if the reading unit main-scans the document in a direction that deviates from the true row direction, for example when the characters and ruled lines are printed at a slight slant on the document.

[Brief description of the drawings]

FIG. 1 is a functional block diagram showing an example of the configuration of an embodiment of the present invention. FIGS. 2(A) and 2(B) are diagrams showing image data and the distribution of the column cumulative black pixel count. FIG. 3 is a diagram showing an example of the operation flow of the ruled line detection area setting unit. FIG. 4(A) is a diagram showing an example of the image data of one character, and FIGS. 4(B) and 4(C) are diagrams showing an example of the distribution of the row cumulative black pixel count in the first ruled line detection area and in the second ruled line detection area, respectively. FIGS. 5(A) and 5(B) are diagrams showing an example of the operation flow of the character ruled line separation unit. FIG. 6 is a diagram showing an example of image data of a vertically written document.

10: reading unit; 12: image data storage unit; 14: ruled line detection area setting unit; 16: character ruled line separation unit; 18: control unit; 20: first peripheral distribution creating unit; 22: first peripheral distribution storage unit; 24: black block detection unit; 26: detection area position determination unit; 28: second peripheral distribution creating unit; 30: second peripheral distribution storage unit; 32: ruled line detection unit; 34: ruled line separation position determination unit.

Continued from the front page. (58) Fields searched (Int. Cl.⁶, DB name): G06K 9/20, G06K 9/34

Claims

(57) [Claims]

A character extraction device comprising: a reading unit that optically reads an information medium on which a character line with ruled lines is written and outputs image data of the information medium; an image data storage unit that stores the image data from the reading unit; a ruled line detection area setting unit that sets a character part area within the area of the character line with ruled lines so as not to include the ruled lines, scans the image data in the character part area of the image data storage unit to detect the cumulative number of black pixels in a predetermined column direction for each scanning line, detects the character start position and end position in a predetermined row direction for each character based on changes in the cumulative number of black pixels in the column direction, and, based on the character start position and end position in the row direction, sets a first ruled line detection area for each character so as to include a ruled line at a position adjacent to the character start position in the predetermined row direction and on the side opposite the character end position in the predetermined row direction, and sets a second ruled line detection area for each character so as to include a ruled line at a position adjacent to the character end position in the predetermined row direction and on the side opposite the character start position in the predetermined row direction; and a character ruled line separation unit that scans the image data in the first and second ruled line detection areas of the image data storage unit to detect the cumulative number of black pixels in the predetermined row direction for each scanning line, performs ruled line detection processing in the first and second ruled line detection areas for each character based on changes in the cumulative number of black pixels in the row direction, and sets the character start position and end position in the predetermined column direction for each character based on the result of the ruled line detection processing.