JPS6120036B2

Info

Publication number: JPS6120036B2
Application number: JP56208306A
Authority: JP
Inventors: Kunio Sakai
Original assignee: Tokyo Shibaura Electric Co Ltd
Current assignee: Toshiba Corp
Priority date: 1981-12-23
Filing date: 1981-12-23
Publication date: 1986-05-20
Also published as: JPS58109980A

Description

Detailed Description of the Invention

The present invention relates to a highly practical character reading device that handles a large character set, including kanji, and can stably and efficiently extract the features used for broad classification.

Technical Background of the Invention

Character recognition conventionally extracts features from an input character pattern, matches them against the features of dictionary patterns registered in advance, and reads (recognizes) the input pattern according to the matching result. When the target character set is large, as it is when kanji are included, the amount of feature matching becomes enormous and processing efficiency drops sharply.

For this reason, the characters to be recognized are conventionally sorted into broad classes by several rough features before the recognition step. This reduces the number of characters that must be matched in the identification stage, which improves efficiency and processing speed. The features used for this broad classification must be the same as those used for identifying candidate characters, and they must be sufficiently stable against deformation of the character pattern, such as that caused by handwriting, and against various kinds of noise.

Features conventionally used for the broad classification of character patterns include those based on the complexity of the character strokes, those based on the shape of the character's periphery, and those based on the coarse shape of the character as a whole. These are described in detail in, for example, the following reference.

Sakai and Watanabe, "Current Status of Printed Kanji Recognition," Information Processing (情報処理), Vol. 22, No. 4, pp. 274-279 (April 1981).

FIG. 1 shows one such example, which focuses on the shape of the periphery of the original character pattern. Character strokes lying within rectangular scanning regions U, D, L, and R placed along the top, bottom, left, and right of the periphery are detected, and their quantity is quantized into three levels, "0", "1", and "2", to serve as feature data. The method shown in FIG. 2a instead focuses on the circumscribed sides u, d, l, and r that touch the original character pattern: the area of the character background adjoining each of these sides is measured and used as a multidimensional feature vector. The former (FIG. 1) looks at the stroke portion and the latter (FIG. 2a) at the background area, but both take an area-like quantity around the periphery of the character pattern as the feature.

Problems with the Background Art

Such feature extraction is very effective when the patterns are standardized, as with printed type, but it is extremely unstable for handwritten characters, which retain their identity as characters while being heavily deformed. Even for printed characters, feature extraction becomes very unstable when the character pattern is chipped or faded, as shown in FIG. 2b. In other words, these methods cannot always extract features stably in the presence of pattern deformation and various kinds of noise.

Object of the Invention

The present invention was made in view of these circumstances. Its object is to provide a highly practical character reading device that takes a large number of characters, including handwritten kanji, as its reading targets and can stably and efficiently extract the features needed for broad classification of the input original character pattern.

Summary of the Invention

The present invention starts from the observation that much of the classification information (features) needed for broad classification lies in the periphery of the character pattern. It extracts feature quantities over a wider area of that periphery, uses relative rather than absolute position information, and makes it possible to use the positional relationships between features as new features. Specifically, in the character reading device of the invention, the centroid (center of gravity) of the background portion extending from each circumscribed side of the input original character pattern toward the center of the pattern is determined for each circumscribed side, the average amount of indentation of the character pattern is obtained from the positions of these centroids relative to the circumscribed frame, and this is extracted as the feature for broad classification.

Effects of the Invention

The invention thus introduces a new concept: the centroid position of the background portion associated with each circumscribed side of the frame enclosing the input character pattern. Because the feature is the position of each centroid relative to the circumscribed frame, the feature information can be obtained with sufficient stability against deformation of the character pattern and various kinds of noise. The features needed for broad classification can therefore be obtained stably and efficiently even for printed characters that are chipped or faded and for handwritten characters, which markedly improves character recognition processing. Even for complex patterns such as kanji, and even when they are handwritten, the centroid positions of the background portions and their relative positional relationships readily yield information that strongly reflects the features of the original character pattern. The invention thus provides notable and highly practical benefits for character recognition processing.

Embodiment of the Invention

An embodiment of the present invention is described below with reference to the drawings.

FIG. 3 illustrates the feature extraction process applied to an input character pattern in this device, using a handwritten character pattern "上" as the example. Feature extraction begins by detecting the circumscribed frame that encloses the given character pattern. The frame is obtained by detecting the left circumscribed side L touching the leftmost point of the pattern, the right circumscribed side R touching the rightmost point, the upper circumscribed side U touching the topmost point, and the lower circumscribed side D touching the bottommost point. The sides U, D, L, and R are easily found by scanning the character pattern and comparing the position coordinates of the character strokes with one another.
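
The frame detection described above can be sketched in a few lines. This is only an illustration, assuming the binarized pattern is held as a list of rows with 1 for stroke pixels and 0 for background; the function name `circumscribed_frame` and the sample bitmap are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # assumed encoding: 1 = stroke pixel, 0 = background


def circumscribed_frame(bitmap: Bitmap) -> Tuple[int, int, int, int]:
    """Return (left, right, top, bottom) as inclusive pixel indices."""
    rows = [y for y, row in enumerate(bitmap) if any(row)]
    cols = [x for row in bitmap for x, value in enumerate(row) if value]
    if not rows:
        raise ValueError("bitmap contains no stroke pixels")
    # Comparing stroke coordinates with one another yields the four sides.
    return min(cols), max(cols), min(rows), max(rows)


# Hypothetical sample pattern with a margin of background around it.
pattern = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

left, right, top, bottom = circumscribed_frame(pattern)
```

Here the sides L, R, U, and D correspond to `left`, `right`, `top`, and `bottom`; a hardware implementation would track the extreme coordinates during the scan rather than build coordinate lists.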

Next, taking each circumscribed side U, D, L, and R as a reference position (reference line), the pattern is scanned from each side toward the center of the character pattern, and the length of each scanning line up to the point where it reaches a character stroke is measured. For signal-processing purposes it is preferable to cap the scan at a maximum length l, which is variable and set according to the size of the character pattern. The sum of these scanning-line lengths gives the area S of the background portion of the character pattern associated with each of the sides U, D, L, and R. For side U, for example, the background area S is obtained as shown in FIG. 3 by scanning from U toward the center of the pattern with scanning lines limited to the maximum length l and summing their lengths. A scan ends when the scanning line reaches a character stroke; if it reaches no stroke, it ends when its length reaches l. Denoting each point of the region of area S obtained in this way by (x, y), the centroid position G(X, Y) of region S is determined as follows.

$$X = \frac{\iint_S x \, S(x, y) \, dx \, dy}{\iint_S S(x, y) \, dx \, dy}, \qquad Y = \frac{\iint_S y \, S(x, y) \, dx \, dy}{\iint_S S(x, y) \, dx \, dy}$$

The centroid position G(X, Y) of the background portion adjoining side U, obtained in this way, is stable against partial chipping of the character strokes and reflects the features of the character pattern over a wide area. The centroid positions of the background portions adjoining the other sides D, L, and R are obtained by the same process.
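
As a rough discrete counterpart of the scan and centroid calculation for side U, the sketch below replaces the integrals with sums over background pixels. Several points are assumptions made for illustration: S(x, y) is treated as 1 on every scanned background pixel, coordinates are measured from the top-left corner of the circumscribed frame using pixel centers, and an empty region returns `None`. The function name, the sample bitmap, and the hard-coded frame values are hypothetical.

```python
from typing import List, Optional, Tuple

Bitmap = List[List[int]]  # assumed encoding: 1 = stroke pixel, 0 = background


def background_centroid_from_top(
    bitmap: Bitmap, left: int, right: int, top: int, bottom: int, max_len: int
) -> Tuple[int, Optional[Tuple[float, float]]]:
    """Scan downward from side U; return (area S, centroid G or None)."""
    area = 0
    sum_x = 0.0
    sum_y = 0.0
    for x in range(left, right + 1):
        for depth in range(max_len):
            y = top + depth
            # A scan line ends at a stroke pixel or at the lower side.
            if y > bottom or bitmap[y][x]:
                break
            area += 1
            sum_x += (x - left) + 0.5  # pixel center, frame-relative
            sum_y += depth + 0.5
    if area == 0:
        return 0, None
    return area, (sum_x / area, sum_y / area)


# Hypothetical sample pattern; its frame is left=1, right=5, top=1, bottom=4.
pattern = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

area_u, centroid_u = background_centroid_from_top(pattern, 1, 5, 1, 4, max_len=4)
```

For this sample the scanned columns contribute 3, 3, 0, 1, and 3 pixels, so the area is 10 and the centroid lies at roughly (2.3, 1.4) in frame-relative pixel units.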

From these centroid positions, the position of the centroid G of region S relative to the character frame, normalized by the size of the circumscribed frame (its height and width), is then obtained as

$$P_U = \left( \frac{X}{H}, \frac{Y}{V} \right)$$

where H and V are the width and height of the circumscribed frame of the character pattern. This normalization is applied to the centroid position G of the background portion adjoining each of the sides U, D, L, and R. The features of the periphery of the input character pattern are thereby obtained as the set of normalized centroid positions {P_U, P_D, P_L, P_R}. This information corresponds to the average amount of indentation of the outline as seen from the sides U, D, L, and R, and so it adequately reflects the features of the character pattern that matter for broad classification. Using the peripheral features {P_U, P_D, P_L, P_R} obtained in this way, the character pattern can be classified into its broad class effectively, stably, and reliably.
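
The effect of the normalization can be illustrated with a small sketch. The function name `normalize_centroid` and the numeric values are hypothetical; the point is only that dividing by the frame width H and height V makes the feature independent of character size.

```python
from typing import Tuple


def normalize_centroid(
    centroid: Tuple[float, float], width: float, height: float
) -> Tuple[float, float]:
    """Return P = (X / H, Y / V) for a frame of the given width and height."""
    x, y = centroid
    return (x / width, y / height)


# Hypothetical centroid of the background region adjoining side U,
# measured from the top-left corner of a 5 x 4 frame.
p_small = normalize_centroid((2.3, 1.4), width=5, height=4)

# The same shape drawn at twice the size: centroid and frame both double.
p_large = normalize_centroid((4.6, 2.8), width=10, height=8)
```

Both calls give approximately (0.46, 0.35), which is why characters of different sizes can be compared directly on this feature.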

For example, suppose the handwritten characters "区" and "エ", which also differ in size, are input as shown in FIGS. 4a and 4b. Applying the process described above gives their peripheral features as {P¹_U, P¹_D, P¹_L, P¹_R} and {P²_U, P²_D, P²_L, P²_R}, respectively. Superimposing and comparing these features as in FIG. 4c shows a marked difference in their average amounts of indentation. In particular, when features belonging to the same circumscribed side are compared, for instance through the distance between the normalized centroid positions, the difference between the character patterns shows up clearly as the magnitude of that distance. The similarity of the two patterns can therefore be judged comprehensively by computing the distances (P¹_U, P²_U), (P¹_D, P²_D), (P¹_L, P²_L), and (P¹_R, P²_R), which makes broad classification easy. Moreover, this judgment can be carried out effectively with simple arithmetic.
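
The per-side comparison might look like the following sketch. The text only speaks of a "distance", so the Euclidean distance used here is an assumption, and the two feature sets are made-up values rather than measurements of the characters in FIG. 4.

```python
import math
from typing import Dict, Tuple

Feature = Dict[str, Tuple[float, float]]  # side name -> normalized centroid
SIDES = ("U", "D", "L", "R")


def side_distances(f1: Feature, f2: Feature) -> Dict[str, float]:
    """Euclidean distance between normalized centroids, side by side."""
    return {
        side: math.hypot(f1[side][0] - f2[side][0], f1[side][1] - f2[side][1])
        for side in SIDES
    }


# Hypothetical feature sets for two different characters.
p1 = {"U": (0.5, 0.10), "D": (0.5, 0.90), "L": (0.30, 0.5), "R": (0.70, 0.5)}
p2 = {"U": (0.5, 0.40), "D": (0.5, 0.60), "L": (0.10, 0.5), "R": (0.90, 0.5)}

distances = side_distances(p1, p2)
total = sum(distances.values())  # one possible overall dissimilarity score
```

Summing the four distances is just one way to combine them; a weighted sum or a maximum would serve equally well as an overall measure.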

FIG. 5 is a schematic block diagram of an embodiment of the device of the present invention, which performs this processing to extract the features for broad classification of character patterns.

The character pattern 1 to be read is scanned and photoelectrically converted by a scanning photoelectric conversion device 2, for example a television camera. The image signal of character pattern 1 from the photoelectric conversion device 2 is fed to a binary quantization device 3, where it is binarized against a discrimination level set, for example, with reference to the background density. The quantized character pattern signal is stored in a pattern storage device 4, such as a one-frame memory, as binary pixel signals corresponding to the scan positions. The character pattern pixel signals stored in storage device 4 are read out by scanning in predetermined directions and passed to the feature extraction process described above.
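
A software analogue of the binary quantization step is sketched below. The patent does not say how the discrimination level is derived from the background density, so the rule used here is an assumption: the background is taken to be the most frequent gray level, ink is assumed darker than paper on a 0 to 255 scale, and `margin` is a hypothetical tuning parameter.

```python
from collections import Counter
from typing import List


def binarize(gray: List[List[int]], margin: int = 64) -> List[List[int]]:
    """Return 1 for stroke pixels and 0 for background pixels."""
    # Estimate the background density as the most common gray level.
    background = Counter(v for row in gray for v in row).most_common(1)[0][0]
    threshold = background - margin  # discrimination level (assumed rule)
    return [[1 if v < threshold else 0 for v in row] for row in gray]


# Hypothetical 2 x 3 gray image: light paper (250) with two dark ink pixels.
gray_image = [
    [250, 250, 30],
    [250, 40, 250],
]

binary_image = binarize(gray_image)
```

The resulting 0/1 grid plays the role of the contents of pattern storage device 4.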

The circumscribed frame detection device 5 searches the position coordinates of the character strokes of the pattern stored in storage device 4 and finds the leftmost coordinate X_L and the rightmost coordinate X_R, giving the left circumscribed side L as L = X_L and the right circumscribed side R as R = X_R. It likewise finds the topmost coordinate Y_U and the bottommost coordinate Y_D of the character strokes, giving the upper side U and the lower side D. The detection device 5 supplies the information on the sides L, R, U, and D to the scanning circuits 6_L, 6_R, 6_U, and 6_D, respectively, as control information.

Each scanning circuit 6_L, 6_R, 6_U, and 6_D sets its scan reference line from the side information it receives and scans the character pattern signal stored in storage device 4 from that reference line toward the center of the pattern, that is, left to right, right to left, top to bottom, and bottom to top, respectively, summing the lengths of the scanning lines associated with its side. In other words, each circuit obtains the background region S referenced to its circumscribed side. The centroid calculation units 7_L, 7_R, 7_U, and 7_D receive the information on region S, compute the centroid position G(X, Y) as described above, and normalize it to obtain the relative position information, namely the normalized centroid positions P_L, P_R, P_U, and P_D. These are combined and supplied to the feature comparison device 8 as the peripheral feature information {P_L, P_R, P_U, P_D} of the input character pattern.
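
The four scanning circuits differ only in scan direction, which the sketch below expresses by choosing rows or columns and reversing them as needed. It assumes the bitmap has already been cropped to the circumscribed frame and that 1 marks a stroke pixel; the function name `scan_lengths` and the sample data are hypothetical.

```python
from typing import List

Bitmap = List[List[int]]  # assumed encoding: 1 = stroke pixel, 0 = background


def scan_lengths(bitmap: Bitmap, side: str, max_len: int) -> List[int]:
    """Length of each scan line from the given side ('U', 'D', 'L' or 'R')."""
    height, width = len(bitmap), len(bitmap[0])
    if side in ("U", "D"):
        # One scan line per column, read top to bottom.
        lines = [[bitmap[y][x] for y in range(height)] for x in range(width)]
    else:
        # One scan line per row, read left to right.
        lines = [list(row) for row in bitmap]
    if side in ("D", "R"):
        lines = [line[::-1] for line in lines]  # scan from the opposite side
    lengths = []
    for line in lines:
        n = 0
        for value in line[:max_len]:
            if value:  # stop at the first stroke pixel
                break
            n += 1
        lengths.append(n)
    return lengths


# Hypothetical pattern cropped to its 5-wide, 4-high circumscribed frame.
cropped = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
]

areas = {
    "U": sum(scan_lengths(cropped, "U", max_len=4)),
    "D": sum(scan_lengths(cropped, "D", max_len=4)),
    "L": sum(scan_lengths(cropped, "L", max_len=5)),
    "R": sum(scan_lengths(cropped, "R", max_len=5)),
}
```

For this sample the background areas are 10 from the top, 0 from the bottom (the bottom stroke fills the lower side), 6 from the left, and 5 from the right.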

The feature dictionary 9 holds feature information of this kind registered in advance for each character. The feature comparison device 8 successively computes the similarity between these dictionary features and the features obtained for the input character pattern. When the similarity is at or above a predetermined tolerance θ, the category of the corresponding dictionary entry is output as the broad classification result.
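
The dictionary lookup with a tolerance θ can be sketched as follows. The patent does not define the similarity measure, so the form 1 / (1 + total distance) is an assumption chosen only so that identical features score 1.0; the category labels "A" and "B" and all feature values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Feature = Dict[str, Tuple[float, float]]  # side name -> normalized centroid
SIDES = ("U", "D", "L", "R")


def similarity(f1: Feature, f2: Feature) -> float:
    """Assumed similarity: 1.0 for identical features, falling toward 0."""
    total = sum(
        math.hypot(f1[s][0] - f2[s][0], f1[s][1] - f2[s][1]) for s in SIDES
    )
    return 1.0 / (1.0 + total)


def broad_classes(
    feature: Feature, dictionary: Dict[str, Feature], theta: float
) -> List[str]:
    """Return every dictionary category whose similarity is at least theta."""
    return [
        label for label, ref in dictionary.items()
        if similarity(feature, ref) >= theta
    ]


# Hypothetical dictionary entries and input feature.
dictionary = {
    "A": {"U": (0.5, 0.10), "D": (0.5, 0.90), "L": (0.30, 0.5), "R": (0.70, 0.5)},
    "B": {"U": (0.5, 0.40), "D": (0.5, 0.60), "L": (0.10, 0.5), "R": (0.90, 0.5)},
}
query = {"U": (0.5, 0.15), "D": (0.5, 0.90), "L": (0.30, 0.5), "R": (0.70, 0.5)}

candidates = broad_classes(query, dictionary, theta=0.8)
```

With θ = 0.8 only category "A" survives; lowering θ widens the candidate set passed on to the detailed recognition stage.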

The embodiment device, which performs the feature extraction of the invention and the broad classification based on it, can thus be realized very simply. Supplying the broad classification result obtained in this way, together with the image signal of the input character pattern, to the character recognition unit in the next stage allows character recognition to be carried out simply and efficiently. Because the features are stable against deformation of character patterns, such as in handwriting, and against various kinds of noise, as described above, the efficiency of recognition processing can be improved markedly. It therefore becomes possible to build a highly practical character reading and recognition system that covers a large number of characters, including kanji, which is a substantial benefit.

The present invention is not limited to the embodiment described above. For example, the embodiment uses the centroid position of region S as the feature information, but a similar effect can be obtained by determining the center of region S as

$$X = \frac{\iint_S S(x, y) \, dx \, dy}{H \cdot V}, \qquad Y = \frac{\iint_S S(x, y) \, dx \, dy}{H \cdot V}$$

and using this as the feature information. This simplifies the data processing and the device configuration. Feature information on the character strokes may also be used together with it for broad classification. In short, the invention can be practiced with various modifications without departing from its gist.

[Brief Description of the Drawings]

FIG. 1 and FIGS. 2a and 2b are diagrams explaining the concept of conventional character pattern feature extraction. FIG. 3 is a diagram explaining the concept of character pattern feature extraction according to the present invention. FIGS. 4a to 4c are diagrams explaining the differences between character patterns as indicated by the character pattern features of the invention. FIG. 5 is a schematic block diagram of the main part of a device according to an embodiment of the invention.

1: character pattern; 2: scanning photoelectric conversion device; 3: binary quantization device; 4: pattern storage device; 5: circumscribed frame detection device; 6_L, 6_R, 6_U, 6_D: scanning circuits; 7_L, 7_R, 7_U, 7_D: centroid calculation units; 8: feature comparison device; 9: feature dictionary.

Claims

[Scope of Claims]

1. A character reading device comprising: means for photoelectrically converting and inputting an original character pattern; means for binary-quantizing and storing the input original character pattern; means for detecting, from the stored original character pattern, a circumscribed frame consisting of circumscribed sides in four directions (top, bottom, left, and right); means for detecting the center of gravity or center position of the character background portion extending from each of the circumscribed sides toward the center of the original character pattern; and means for broadly classifying the original character pattern according to the position, relative to the circumscribed frame, of the center of gravity or center position of each character background portion.

2. The character reading device according to claim 1, wherein the means for detecting the center of gravity or center position of the character background portion scans the inside of the circumscribed frame from each circumscribed side toward the center of the original character pattern and computes the position from the scanning distance from each circumscribed side to the character stroke.

3. The character reading device according to claim 1, wherein the position, relative to the circumscribed frame, of the center of gravity or center position of the character background portion corresponding to each circumscribed side is obtained as normalized position data.

4. The character reading device according to claim 1, wherein the broad classification of the original character pattern is performed using the set of positions, relative to the circumscribed frame, of the centers of gravity or center positions of the character background portions corresponding to the respective circumscribed sides as feature information on the periphery of the original character pattern.