JPS6326915B2


Info

Publication number: JPS6326915B2
Application number: JP55187975A
Authority: JP
Inventors: Hideaki Sugawara
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1980-12-29
Filing date: 1980-12-29
Publication date: 1988-06-01
Also published as: JPS57113184A

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a character recognition method, and in particular to a handwritten character recognition method in which, among the handwritten characters compared against a dictionary, the part shared by similar characters, for example the left-hand radical (hen, 偏), is masked, and recognition is performed by comparing the remaining unmasked part, for example the right-hand component (tsukuri, 旁), with the dictionary.

Conventionally, a known method for recognizing patterns such as characters works as follows. Characters written inside the frames of a sheet are read by an optical reader. The photoelectrically converted signal is applied to a frame detection circuit to detect the frame, and a cut-out circuit then converts it into a binary signal of "1" and "0" corresponding to the character and the background (white portion). A feature extraction circuit extracts the features of the handwritten character. The dictionary, meanwhile, is divided into categories by feature and holds binarized, feature-extracted signals. The binarized, feature-extracted signal of the handwritten character is compared with the dictionary, several candidates with close comparison values are selected, and, for example, the candidate with the smallest comparison value is recognized as the character written in the frame.
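The conventional full-region matching described above can be sketched as follows. This is only an illustrative sketch, not the circuit of the patent: the names `hamming_distance`, `recognize_full_region`, and `toy_dictionary`, the flat list layout of the binary features, and the labels "A", "B", "C" are all assumptions introduced for the example.

```python
def hamming_distance(a, b):
    """Count the positions where two equal-length binary feature lists differ."""
    if len(a) != len(b):
        raise ValueError("feature lists must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def recognize_full_region(input_features, dictionary, num_candidates=3):
    """Match over the whole frame and return (best_label, candidate_labels).

    `dictionary` maps a label to its binary feature list (assumed layout).
    """
    scored = sorted(
        (hamming_distance(input_features, feats), label)
        for label, feats in dictionary.items()
    )
    candidates = [label for _, label in scored[:num_candidates]]
    return candidates[0], candidates


# Toy data: 8 binary features per character (hypothetical values).
toy_dictionary = {
    "A": [1, 0, 1, 1, 0, 0, 1, 0],
    "B": [1, 0, 1, 1, 1, 1, 0, 0],
    "C": [0, 1, 0, 0, 1, 1, 0, 1],
}
best, shortlist = recognize_full_region([1, 0, 1, 1, 0, 1, 1, 0], toy_dictionary, 2)
```

Here the smallest comparison value decides the result, and the shortlist corresponds to the "several candidates with close comparison values" mentioned in the text.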

The comparison operation described above is referred to as matching in the following description.

That is, in the conventional configuration, as shown in FIG. 1, the handwritten character 3 in the frame 1 is cut out (2) with uniform dimensions, and matching is performed uniformly over the entire region within the frame.

Consequently, for handwritten characters such as kanji composed of a left-hand radical (hen) and a right-hand component (tsukuri), as shown in FIG. 1, the dictionary contains many characters that share at least the radical. The differences among their comparison outputs are small, so recognition accuracy is poor.

The present invention aims to provide a character recognition method free of the above drawback. Its feature is that dictionary categories having similar left-hand radicals are selected, and for the dictionary entries of the selected candidate categories the radical part is masked and only the right-hand component is matched, thereby improving the recognition rate within a group of characters that share a similar radical. An embodiment of the present invention is described in detail below with reference to FIGS. 2 to 4.

FIG. 2 is a system diagram of the present invention. A document 4, in which a handwritten character 3 is written in a frame 1, is imaged by a vidicon 5. The character 3 in the frame 1 is read out and photoelectrically converted, cut out by the cut-out circuit 6, and passed to the normalization circuit 7 in the next stage. Because character size varies with individual writing habits, the normalization circuit reduces oversized characters, thins the line width, and removes noise. The feature extraction circuit 8 then classifies the normalized pattern into a plurality of features and extracts them.
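The size normalization step can be illustrated with a minimal sketch. The patent does not say how the normalization circuit 7 works internally, so everything here is an assumption: the function name `normalize_size`, the list-of-rows image layout with 1 meaning ink, and the use of nearest-neighbour sampling. Line thinning and noise removal are deliberately left out.

```python
def normalize_size(image, size=8):
    """Crop a binary image to its ink bounding box and rescale it to size x size.

    `image` is a non-empty list of equal-length rows of 0/1 values, 1 meaning ink
    (an assumed layout). Nearest-neighbour sampling is used; thinning and noise
    removal are not implemented in this sketch.
    """
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        # Blank frame: nothing to normalize.
        return [[0] * size for _ in range(size)]
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1
    return [
        [image[top + (r * height) // size][left + (c * width) // size]
         for c in range(size)]
        for r in range(size)
    ]
```

With this kind of step, a small character and a large character written in the same frame are brought to the same grid before features are extracted.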

The feature sequence extracted and binarized in this way is compared sequentially with the binarized characters stored in advance in the dictionary 10; that is, matching 9 is performed. In this way, characters similar to the handwritten character in the frame 1 are recognized (11) from the dictionary. The plurality of candidate characters selected from the dictionary 10 by this recognition are then masked by the mask processing circuit 12, and a second recognition 13 is performed.

That is, in the present invention, matching is first performed on the two-dimensional pattern from the vidicon over the entire region of the two-dimensionally generated feature sequence, and dictionary entries of similar categories are selected. Within the dictionary of the selected categories, the left half or the right half of the in-frame pattern is then matched against the input two-dimensional pattern. If, averaged over the dictionary entries of the similar categories, the left-half or right-half matching is clearly more similar than the full-region matching, the similar part (the left-hand radical or the right-hand component) is masked; if it is dissimilar on average, no mask region is set.

Consider the mask setting for "鐘" (bell), "鉄" (iron), and "銅" (copper), a set of characters sharing the metal radical (金). The three characters differ greatly in their right-hand components "童", "失", and "同", whereas the radical parts differ only slightly, even allowing for variation in handwriting. The radical part is therefore masked, and matching is performed only on the right-hand component.
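The mask-setting rule above can be sketched as follows. The patent gives no numerical criterion for "clearly similar", so the `margin` threshold is a hypothetical parameter, as are the names `mean_abs_diff` and `choose_mask` and the grid layout (rows of 0/1 features, split at the middle column). Distances are averaged per feature so that a half region and the full region can be compared on the same scale, which is also an assumption.

```python
def mean_abs_diff(a, b):
    """Average absolute difference per feature between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def choose_mask(input_grid, candidate_grids, margin=0.1):
    """Return 'left', 'right', or None: the half to mask (exclude from matching).

    Grids are lists of rows of 0/1 features. `margin` is a hypothetical
    threshold standing in for "clearly similar on average".
    """
    width = len(input_grid[0])
    half = width // 2
    spans = {"left": (0, half), "right": (half, width), "full": (0, width)}

    def flatten(grid, start, stop):
        return [v for row in grid for v in row[start:stop]]

    totals = {name: 0.0 for name in spans}
    for cand in candidate_grids:
        for name, (start, stop) in spans.items():
            totals[name] += mean_abs_diff(flatten(input_grid, start, stop),
                                          flatten(cand, start, stop))
    avg = {name: total / len(candidate_grids) for name, total in totals.items()}

    # The half with the smaller average distance is the more similar one.
    side = min(("left", "right"), key=lambda s: avg[s])
    # Mask it only if it is clearly more similar than the full region.
    return side if avg["full"] - avg[side] > margin else None
```

For candidates that share a radical with the input, the left-half distance is small compared with the full-region distance, so the function returns `'left'`; when neither half stands out, it returns `None` and no mask region is set.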

The masking method is described below with reference to FIGS. 3 and 4. As shown in FIG. 3, the left half of each of the three patterns containing the metal radical is the region marked "0" in the mask 14. When this mask is selected, no matching is performed in that region, and matching is performed only in the region marked "1" in the mask 14.

FIG. 4 shows an embodiment of the mask processing circuit 12 of the present invention. Suppose that the handwritten character 3 "鐘" is written in the frame 1 of the document 4, and that in the recognition 11 the similar characters containing the metal radical, "鉄", "銅", and "鐘", are selected as the first, second, and third candidates, in that order. The masks 14, 14 are then applied to the first candidate "鉄" and to the handwritten character "鐘" in the frame 1. When the radical part (left half) has been found to differ little, as described above (using the distance of equation (1) given below), the mask takes a form in which the radical part is represented by binary "0" values, and the right-hand parts "童" and "失" are compared; that is, only the right-hand components are matched, in the matching regions 15 and 16. For this matching of the "童" and "失" parts, let A and B be the input and dictionary values. The absolute difference calculation circuit 17 obtains |A − B|, and the absolute difference addition circuit 18 computes the distance α(AB) so as to satisfy equation (1):

$$\alpha(AB) = \sum_{k=1}^{n} |A - B| \qquad (1)$$

This computation is performed for the k characters selected in the first recognition 11, where n is the number of extracted features expressed in a predetermined number of bits. That is, the right-hand component "同" of the second candidate "銅" is next matched with the right-hand component "童" of the input, and then the right-hand component "童" of the third candidate "鐘" is matched with the right-hand component "童" of the input. The minimum value calculation circuit 19 finds the minimum of all these distances α, and the candidate giving the minimum value is output to the output terminal 20 as the recognition result for the input.
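A minimal sketch of the masked distance of equation (1) and the minimum selection follows. It assumes that A and B are lists of n binary features and that the mask is a list of the same length with "0" for masked positions and "1" for matched positions; the function names and the toy feature values are hypothetical and are not taken from the patent.

```python
def masked_distance(a, b, mask):
    """Sum of |A - B| over the features where mask == 1 (equation (1), masked)."""
    return sum(abs(x - y) for x, y, m in zip(a, b, mask) if m == 1)


def second_recognition(input_features, candidates, mask):
    """Return the label of the candidate with the smallest masked distance.

    `candidates` is a list of (label, features) pairs from the first recognition.
    """
    best = min(candidates,
               key=lambda item: masked_distance(input_features, item[1], mask))
    return best[0]


# Toy 8-feature vectors: the first 4 features stand for the left half (radical),
# the last 4 for the right half. All values are made up for illustration.
mask = [0, 0, 0, 0, 1, 1, 1, 1]          # "0" = masked, "1" = matched
handwritten = [1, 0, 1, 1, 1, 1, 0, 0]   # stands in for the input "鐘"
candidates = [
    ("鉄", [1, 1, 1, 1, 0, 0, 1, 1]),
    ("銅", [1, 1, 1, 1, 1, 0, 1, 0]),
    ("鐘", [1, 1, 1, 1, 1, 1, 0, 1]),
]
result = second_recognition(handwritten, candidates, mask)
```

With these toy values only the right-half features contribute to the distance, so the shared radical no longer dilutes the difference between candidates.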

In the above description, the radical was masked and only the right-hand component was matched when the distance α for the radical between the input and the candidate patterns obtained in the first recognition 11 was small. Clearly, when the radicals of the input and the candidate patterns are not similar, that is, when the distance α is large, the right-hand component is matched as well. When the distance for the right-hand component between the input and the candidate patterns is small, the right half may be masked so that only the left half is matched. Because the present invention performs pattern recognition by the method described above, it can raise recognition accuracy and improve the recognition rate.
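The three cases just described (mask the left half, mask the right half, or set no mask) can be expressed as a small mask generator. This is a sketch under assumptions: the name `build_mask`, the `masked_side` argument, and the split at the middle column are illustrative choices, with "0" marking positions excluded from matching and "1" marking positions that are matched, following the convention of mask 14.

```python
def build_mask(rows, cols, masked_side=None):
    """Return a rows x cols grid: 0 where matching is skipped, 1 where it is done.

    `masked_side` is None (no mask region), 'left', or 'right'.
    """
    if masked_side not in (None, "left", "right"):
        raise ValueError("masked_side must be None, 'left', or 'right'")
    half = cols // 2

    def cell(c):
        if masked_side == "left":
            return 0 if c < half else 1
        if masked_side == "right":
            return 1 if c < half else 0
        return 1  # no mask region: match the full frame

    return [[cell(c) for c in range(cols)] for _ in range(rows)]
```

For example, `build_mask(2, 4, "left")` gives a grid whose left two columns are 0, so only the right-hand component takes part in matching.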

[Brief description of the drawings]

FIG. 1 is a plan view of a pattern for explaining the conventional pattern cut-out method. FIG. 2 is a system diagram showing the method of the present invention for recognizing characters and other patterns. FIG. 3 is a plan view of patterns for explaining the mask in the character recognition method of the present invention. FIG. 4 is a systematic explanatory diagram of the mask processing circuit portion of the present invention.

1: frame; 2: cut-out; 3: character; 4: document; 5: vidicon; 6: cut-out circuit; 7: normalization circuit; 8: feature extraction circuit; 9: matching circuit; 10: dictionary; 11, 13: recognition regions; 12: mask processing circuit; 14: mask; 17: absolute value calculation circuit; 18: absolute value addition circuit; 19: minimum value calculation circuit.

Claims

[Claims] 1. A character recognition method in which a plurality of features of a handwritten character 3 in a frame 1, read by an optical reading device, are compared with a plurality of features of characters stored in a dictionary 10 arranged in categories by feature, to recognize the character, characterized in that: in a first recognition 11, the character features over the entire region of the handwritten character 3 in the frame 1 are matched with the character features over the entire region stored in the dictionary 10, and similar character parts and dissimilar character parts are identified from the result; a mask is applied to the similar character part; and in a second recognition 13, only the dissimilar character part is compared with the corresponding character part stored in the dictionary 10, whereby the handwritten character 3 in the frame 1 is recognized.