JPH031712B2

Info

Publication number: JPH031712B2
Application number: JP57046749A
Authority: JP
Inventors: Kunio Sakai
Original assignee: Tokyo Shibaura Electric Co Ltd
Current assignee: Toshiba Corp
Priority date: 1982-03-24
Filing date: 1982-03-24
Publication date: 1991-01-11
Also published as: JPS58165178A

Description

[Detailed Description of the Invention]

[Technical field of the invention]

The present invention relates to a highly practical character reading device that takes a large number of characters, including kanji, as its reading targets and can stably and efficiently extract the features used for broad classification.

[Technical background of the invention]

Character recognition is commonly performed by extracting the features of an input character pattern, matching them against the features of dictionary patterns registered in a dictionary in advance, and reading the input character pattern according to the matching result. When a large number of characters including kanji are to be read, however, the amount of feature matching becomes enormous and processing efficiency drops markedly. Conventionally, therefore, the characters to be recognized are first broadly classified by several coarse features before the recognition process, which reduces the number of characters matched at the identification stage and thereby improves processing efficiency and speed. In this scheme the features used for the broad classification and those used for identifying candidate characters need to be the same, and these features must be sufficiently stable against the deformation of character patterns caused by handwriting and against various kinds of noise.

The features conventionally used for such broad classification of character patterns include those based on the complexity of the character lines, those based on the shape of the character periphery, and those based on the coarse shape of the whole character. These are described in detail in, for example, the following reference.

Sakai and Watanabe, "Current Status of Printed Kanji Recognition," Information Processing (Joho Shori), Vol. 22, No. 4, pp. 274-279 (April 1981).

Fig. 1 shows one example. It looks at the shape around the original character pattern: character lines lying inside rectangular scanning regions U, D, L, and R set along the top, bottom, left, and right of the periphery are detected, and their amount is quantized into three levels, "0", "1", and "2", to form the feature data. The method shown in Fig. 2 looks at the circumscribed edges u, d, l, and r touching the original character pattern, measures the area of the character background touching each of these edges, and uses the areas as a multidimensional feature vector. The former (Fig. 1) looks at the character-line portion and the latter (Fig. 2) at the background area, but both can be said to use an area-like quantity around the character pattern as the feature.

[Problems with background technology]

This kind of feature extraction works very well when the pattern is standardized, as with printed type, but it is extremely unstable when the pattern is heavily deformed while still retaining its character-specific features, as with handwriting. Even for printed type, the extraction becomes highly unstable when the character pattern has chipped or faint portions. In other words, the features cannot always be extracted stably in the presence of character pattern deformation and various kinds of noise.

[Purpose of the invention]

The present invention was made in view of these circumstances. Its object is to provide a highly practical character reading device that takes a large number of characters, including handwritten kanji, as its reading targets and can stably and efficiently extract the features needed for broad classification of an input original character pattern.

[Summary of the invention]

The present invention starts from the observation that much of the classification information (features) needed for broad classification of characters lies in the periphery of the character pattern. In that periphery, the length of the boundary line between the character-line portion and the background portion is obtained for each circumscribed edge of the input original character pattern, the average amount of unevenness of the character pattern is obtained from the value of each boundary length relative to the length of the circumscribed frame, and this amount is extracted as the feature for broad classification.

[Effect of the invention]

The present invention introduces a new concept: the length of the boundary line between the character-line portion and the background portion, taken for each circumscribed edge of the frame that encloses the input character pattern. This length expresses the degree of unevenness of the character periphery, in other words the complexity of its shape. Because the feature is the size of each boundary length relative to the circumscribed frame, the feature information is obtained stably despite various disturbances such as changes in character size or shifts in position, and the broad classification of input patterns can be performed stably and efficiently, which markedly improves character recognition processing. Moreover, whether the target is a complex character pattern such as kanji, handwritten characters, or printed type, the feature is obtained easily by the simple method described below, which makes the invention highly practical for character recognition.

[Embodiments of the invention]

An embodiment of the present invention is described below with reference to the drawings.

Fig. 3 illustrates the feature extraction process applied to an input character pattern in the present invention, using a handwritten pattern of the character "天" as the example. Feature extraction begins by detecting the circumscribed frame that encloses the given character pattern. The frame is obtained by detecting the left circumscribed edge L touching the leftmost point of the character pattern, the right circumscribed edge R touching its rightmost point, the upper circumscribed edge U touching its topmost point, and the lower circumscribed edge D touching its bottommost point. The edges U, D, L, and R are easily found by scanning the character pattern and computing the extreme values of the position coordinates of the character lines.
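
The frame detection described above can be sketched as follows. This is only an illustration, not the patented circuit: it assumes the binarized pattern is held as a list of rows with 1 for a character-line pixel and 0 for background, and the function names `circumscribed_frame` and `crop_to_frame` are hypothetical.

```python
def circumscribed_frame(pattern):
    """Return (x_left, x_right, y_top, y_bottom) of the character-line pixels.

    `pattern` is a list of rows; 1 marks a character-line pixel, 0 background
    (an assumed representation, not specified in the source).
    """
    rows = [y for y, row in enumerate(pattern) if any(row)]
    cols = [x for row in pattern for x, v in enumerate(row) if v]
    if not rows:
        raise ValueError("pattern contains no character-line pixels")
    # Extreme coordinates of the character lines give the edges L, R, U, D.
    return min(cols), max(cols), rows[0], rows[-1]


def crop_to_frame(pattern):
    """Cut the pattern down to its circumscribed frame."""
    x_l, x_r, y_u, y_d = circumscribed_frame(pattern)
    return [row[x_l:x_r + 1] for row in pattern[y_u:y_d + 1]]
```

Cropping to the frame first means that later scans can start at index 0 of each row or column, which corresponds to starting from the reference line.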

Next, taking each of the circumscribed edges U, D, L, and R as a reference position (reference line), the pattern is scanned from each edge toward the center of the character pattern, and the length of each scan line up to the point where it reaches a character line is obtained. For signal-processing reasons it is preferable to limit the maximum scan length to a fixed proportion of the character pattern size. The sum of the differences between these scan-line lengths is then computed for each of the edges U, D, L, and R, and this value is taken as the length of the boundary line between the background portion and the character-line portion of the character pattern.

For example, the boundary length l_L for the edge L is obtained as follows. As shown in Fig. 3, the maximum scan length is set to hH (h > 1), the pattern is scanned from L as the reference position toward the center of the character pattern, and the length of each scan line is obtained. A scan line that does not reach a character line is terminated when its length reaches hH. As shown in Fig. 4, the absolute differences in length between adjacent scan lines are then summed, and the vertical width V of the character is added. The contour length l_L on the side of the edge L obtained in this way (the length of the thick line in Fig. 4) is a wide-area feature that is stable against the line width of the character pattern, the slant of the character, and positional shifts. The contour lengths of the background portions touching the other edges R, U, and D are obtained by the same procedure.
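
A minimal sketch of the left-edge computation, under stated assumptions: the pattern is already cropped to its circumscribed frame, one scan line corresponds to one pixel row, and lengths are counted in pixels. The factor `h` is left as a caller-supplied parameter, and the scan is additionally clamped to the frame width so that indexing stays inside the pattern; that clamp and the function names are choices of this sketch, not part of the source.

```python
def left_scan_lengths(pattern, h):
    """Scan every row from the left edge L until a character line is met.

    `pattern`: cropped binary pattern (list of rows, 1 = character line).
    `h`: cap factor; a scan stops after h*H pixels if nothing is met.
    """
    width = len(pattern[0])                 # H, horizontal width of the frame
    cap = min(int(h * width), width)        # clamp is an assumption of this sketch
    lengths = []
    for row in pattern:
        d = 0
        while d < cap and row[d] == 0:      # walk over background pixels
            d += 1
        lengths.append(d)
    return lengths


def left_boundary_length(pattern, h):
    """l_L = sum of |differences| between adjacent scan lengths, plus V."""
    d = left_scan_lengths(pattern, h)
    height = len(pattern)                   # V, vertical width of the frame
    return sum(abs(a - b) for a, b in zip(d, d[1:])) + height
```

The absolute differences measure how far the outline moves horizontally from one row to the next, and the added V accounts for the vertical run of one unit per scan line, so the total approximates the length of the stair-step outline seen from L.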

These values are then normalized by the size of the circumscribed frame of the character pattern, that is, by its vertical and horizontal widths, giving the normalized boundary lengths

C_L = l_L / (H + V)
C_R = l_R / (H + V)
C_U = l_U / (H + V)
C_D = l_D / (H + V)

where H and V are the horizontal and vertical widths of the circumscribed frame of the character pattern. The information describing the periphery of the input character pattern is thus obtained as the set of boundary lengths {C_L, C_R, C_U, C_D}. This information corresponds to the average degree of unevenness of the outline as seen from the edges U, D, L, and R, in other words to the complexity of the character periphery, and it therefore reflects well the features of a character pattern that matter for broad classification. Using the peripheral features {C_L, C_R, C_U, C_D} obtained in this way, the character pattern can be broadly classified effectively, stably, and reliably.
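
The four normalized features can be sketched in one function. The assumptions are the same as for any pixel-based illustration of this method: a cropped binary pattern as a list of rows, one scan line per pixel row or column, and a scan cap clamped to the frame. The name `boundary_features` and the choice to add H (the number of scan lines) for the U and D sides, by symmetry with adding V for L and R, are assumptions of this sketch.

```python
def boundary_features(pattern, h):
    """Return {C_L, C_R, C_U, C_D} for a binary pattern cropped to its frame."""
    V, H = len(pattern), len(pattern[0])
    columns = [list(col) for col in zip(*pattern)]    # transpose for U and D

    def boundary_length(lines, cap):
        # Depth of each scan: background pixels passed before a character
        # line is met, or the cap if none is met.
        depths = []
        for line in lines:
            d = 0
            while d < cap and line[d] == 0:
                d += 1
            depths.append(d)
        # Sum of absolute differences plus the number of scan lines
        # (V for the L and R sides, H for the U and D sides).
        return sum(abs(a - b) for a, b in zip(depths, depths[1:])) + len(lines)

    cap_h = min(int(h * H), H)
    cap_v = min(int(h * V), V)
    raw = {
        "L": boundary_length(pattern, cap_h),
        "R": boundary_length([row[::-1] for row in pattern], cap_h),
        "U": boundary_length(columns, cap_v),
        "D": boundary_length([col[::-1] for col in columns], cap_v),
    }
    # Normalize every boundary length by H + V.
    return {side: length / (H + V) for side, length in raw.items()}
```

Because both the raw lengths and H + V grow in proportion when a pattern is enlarged, the returned values stay the same for a pattern and its uniformly scaled copy, which is the size independence the text describes.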

For example, when variously shaped handwritten instances of the character "古" are input, as shown in Fig. 5(a), (b), and (c), the boundary lengths l_1, l_2, and l_3 of the upper side of each pattern obtained by the above process all amount to measuring l_4 (= H_4 + hV_4) of the pattern with the shape shown in Fig. 5(d). Consequently, once this feature information is normalized by the character size, the feature value is the same as long as the ratio of H to V is the same. The set {C_L, C_R, C_U, C_D} can therefore be said to be a feature stabilized against character deformation.
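
The dependence on the ratio of H to V alone can be made explicit with a short worked step. The symbol r is introduced here only for illustration and does not appear in the source; the expression for l_4 is the one given in the text.

```latex
C_U \;=\; \frac{l_4}{H_4 + V_4}
    \;=\; \frac{H_4 + h V_4}{H_4 + V_4}
    \;=\; \frac{r + h}{r + 1},
\qquad r = \frac{H_4}{V_4}.
```

The last form is obtained by dividing numerator and denominator by V_4. It contains only r and h, so two patterns with the same aspect ratio give the same normalized value regardless of their absolute size.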

Fig. 6 is a schematic block diagram of a device according to an embodiment of the present invention that performs this processing to extract the features for broad classification of character patterns.

The character pattern 1 to be read is scanned and photoelectrically converted by a scanning photoelectric conversion device 2, such as a television camera. The image signal of the character pattern 1 input through the photoelectric conversion device 2 is fed to a binary quantization device 3, where it is discriminated at a discrimination level set, for example, with reference to the background density, and is thereby binarized. The binarized character pattern signal is stored in a pattern storage device 4, such as a one-frame memory, as binary pixel signals corresponding to the scanning positions. The character pattern pixel signals stored in the storage device 4 are read out by scanning in predetermined directions and passed to the feature extraction process described above.
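
As a rough software analogue of the binary quantization step, the sketch below thresholds a grayscale image at a level derived from the background. The source only says that the level is set with reference to the background density; estimating the background as the median pixel value, assuming dark characters on a light background, and the `offset` parameter are all assumptions made for this illustration.

```python
def binarize(gray, offset=64):
    """Binarize a grayscale image given as a list of rows (0 = black, 255 = white).

    Returns a pattern with 1 for character-line pixels and 0 for background.
    """
    pixels = sorted(v for row in gray for v in row)
    background = pixels[len(pixels) // 2]   # median as background level (assumption)
    level = background - offset             # discrimination level
    return [[1 if v < level else 0 for v in row] for row in gray]
```

The median is a reasonable background estimate only when background pixels are the majority, which usually holds for a single character on paper.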

The circumscribed frame detection device 5 searches the position coordinates of the character lines of the character pattern stored in the storage device 4, finds the leftmost coordinate X_L and the rightmost coordinate X_R, and sets the left circumscribed edge as L = X_L and the right circumscribed edge as R = X_R. At the same time it finds the topmost coordinate Y_U and the bottommost coordinate Y_D of the character lines and obtains the upper circumscribed edge U and the lower circumscribed edge D. The circumscribed frame detection device 5 supplies the information on the edges L, R, U, and D to scanning circuits 6L, 6R, 6U, and 6D, respectively, as control information. Each scanning circuit 6L, 6R, 6U, 6D sets the reference line at which scanning starts from the edge information it receives, scans the character pattern signal stored in the storage device 4 from that reference line toward the center of the character pattern (left to right, right to left, top to bottom, and bottom to top, respectively), and obtains the scan-line lengths for its edge. The scan-line information is input to boundary line calculation units 7L, 7R, 7U, and 7D, which compute the sums l_L, l_R, l_U, and l_D of the scan-line length differences as described above and normalize them to obtain the relative quantities C_L, C_R, C_U, and C_D. These are combined and supplied to a feature comparison device 8 as the peripheral feature information {C_L, C_R, C_U, C_D} of the input character pattern.

The feature dictionary 9 holds, registered in advance, feature information of the kind described above for each character to be recognized. The feature comparison device 8 computes, one after another, the similarity between these dictionary features and the features obtained for the input character pattern. When the similarity is equal to or greater than a predetermined tolerance θ, the category of the dictionary entry that produced it is output as the broad classification result.
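
The comparison against the dictionary might look like the sketch below. The source does not define the similarity measure, so the one used here, 1 / (1 + Euclidean distance) over the four features, is purely an assumption, as are the function name `broad_classify` and the dictionary layout.

```python
import math


def broad_classify(features, dictionary, theta):
    """Return (category, similarity) pairs whose similarity is >= theta.

    `features`: {"L": ..., "R": ..., "U": ..., "D": ...} for the input pattern.
    `dictionary`: category -> reference features in the same layout.
    """
    candidates = []
    for category, reference in dictionary.items():
        distance = math.sqrt(
            sum((features[k] - reference[k]) ** 2 for k in "LRUD")
        )
        similarity = 1.0 / (1.0 + distance)   # assumed measure, in (0, 1]
        if similarity >= theta:
            candidates.append((category, similarity))
    # Best match first, so a later recognition stage can try it first.
    return sorted(candidates, key=lambda c: c[1], reverse=True)
```

Every category that passes the threshold is returned rather than a single winner, which matches the role of broad classification: it narrows the candidate set for the detailed recognition stage instead of deciding the character.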

As described, a device that performs the feature extraction of the present invention and the broad classification based on it can be realized very simply. If the broad classification result obtained in this way and the image signal of the input character pattern are passed to the character recognition unit in the next stage, character recognition can be performed simply and efficiently. Because the feature is stable against the deformation of character patterns such as handwriting and against various kinds of noise, as noted above, the efficiency of the recognition process can be improved markedly. The invention thus makes it possible to build a highly practical character reading and recognition system that handles a large number of characters including kanji.

The present invention is not limited to the embodiment described above. For example, the embodiment obtains the boundary length as the sum of the differences in length between adjacent scan lines, but a similar effect can be obtained by tracing the outline of the character portion directly to obtain the boundary length. Doing so increases the amount of data processing but yields a more detailed and more accurate boundary length. Broad classification may also be performed in combination with other feature information, such as that of the character lines or the background portion. In short, the present invention can be carried out with various modifications that do not depart from its gist.

[Brief explanation of the drawing]

Figs. 1 and 2 are diagrams explaining the concept of conventional feature extraction from character patterns. Fig. 3 is a diagram explaining the concept of feature extraction from a character pattern according to the present invention. Fig. 4 is a diagram explaining the specific method of feature extraction according to the present invention. Figs. 5(a) to 5(d) are diagrams illustrating the effect of the present invention. Fig. 6 is a schematic block diagram of the main part of a device according to an embodiment of the present invention.

1: character pattern; 2: scanning photoelectric conversion device; 3: binary quantization device; 4: pattern storage device; 5: circumscribed frame detection device; 6L, 6R, 6U, 6D: scanning circuits; 7L, 7R, 7U, 7D: boundary line calculation units; 8: feature comparison device; 9: feature dictionary.

Claims

[Scope of Claims]

1. A character reading device comprising: means for photoelectrically converting and inputting an original character pattern; means for binarizing and storing the input original character pattern; means for detecting a circumscribed frame consisting of the circumscribed edges on the four sides (top, bottom, left, and right) of the stored original character pattern; means for detecting the contour of the character background portion in the direction from each circumscribed edge toward the center of the original character; means for detecting the average amount of unevenness of the periphery of the original character pattern from the length of the background contour corresponding to each circumscribed edge relative to the circumscribed frame; and means for broadly classifying the original character pattern from the detected average amount of unevenness.

2. The character reading device according to claim 1, wherein the means for detecting the contour length of the character background portion scans from each circumscribed edge toward the center of the original character pattern and accumulates the differences in scanning distance from the circumscribed edge to the character line.