JPH0474755B2

Info

Publication number: JPH0474755B2
Application number: JP58034482A
Authority: JP
Priority date: 1983-03-04
Filing date: 1983-03-04
Publication date: 1992-11-27
Also published as: JPS59161784A

Description

(Technical Field) The present invention relates to character recognition, and more particularly to online character recognition of alphabetic characters, numerals, katakana, and the like.

(Background Art) Conventional online character recognition devices have been built around the information that is available only because input is online, namely information about each stroke itself and information indicating the order of the strokes. Using these two kinds of information, they employed the so-called stroke analysis method: the features of every stroke are held in a dictionary in fine detail, and an input character is recognized by looking up the features of its strokes in that dictionary.

With the stroke analysis method, even for a set of about 100 characters such as alphabetic characters, numerals, and katakana (hereinafter referred to as ANK and the like), the dictionary becomes enormous because the features of each character must be described stroke by stroke, including the deformed variants of each character. Processing time therefore grows, and shortening it requires larger-scale hardware. Moreover, to compensate for these drawbacks there was no choice but to restrict the number of strokes and the stroke order, and these restrictions on character style, stroke count, and stroke order were themselves a major drawback that inconvenienced the writer.

(Problem to be solved by the invention) The object of the present invention is to eliminate these drawbacks by providing a coarse classification method for online character recognition that does not rely on the stroke analysis method. Instead, it treats the whole character as a figure and recognizes the written character by identifying how the stroke line segments are distributed within it. The invention is described in detail below.

(Structure and operation of the invention) FIG. 1 is a block diagram showing an embodiment of the present invention. The device consists of a tablet 1, an analog-to-digital converter 2 (hereinafter A/D), an input register 3 (hereinafter IR), an arithmetic unit 4 including arithmetic registers (hereinafter ALU), a partial value register 5 (hereinafter PR), a set-value register 6 (hereinafter AR), a reference register 7 (hereinafter RR), a comparator 8 (hereinafter COMP), and a candidate register 9 (hereinafter SR).

A character written on the tablet 1 is digitized by the A/D 2, and the resulting data are stored sequentially in the IR 3. Data are stored in the IR 3 at the sampling rate of the tablet 1, and the stored data represent absolute coordinates on the tablet 1. While data are being stored from the A/D 2 into the IR 3, the data in the IR 3 are sent to the ALU 4 on a time-shared basis and converted into a relative coordinate system centered on the start point of the first stroke written within a character frame; identical points are then deleted. This prevents the same point from being input repeatedly, and complicating later processing, when the writer holds the writing instrument at one position on the tablet 1.
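The conversion to relative coordinates and the deletion of identical points can be sketched as follows. This is a minimal illustration, not the circuit of the embodiment: the function name `to_relative` is hypothetical, and "identical points" is assumed to mean consecutive samples with the same coordinates, which is what a stationary writing instrument would produce.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def to_relative(samples: List[Point]) -> List[Point]:
    """Shift samples so the first sampled point is the origin,
    then drop consecutive repeats of the same point."""
    if not samples:
        return []
    x0, y0 = samples[0]  # start point of the first stroke (assumed to be sample 0)
    out: List[Point] = []
    for x, y in samples:
        p = (x - x0, y - y0)
        # Keep the point only if it differs from the previously kept one.
        if not out or out[-1] != p:
            out.append(p)
    return out
```

For example, `to_relative([(10, 20), (10, 20), (12, 21)])` keeps two points, the origin and `(2, 1)`.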

The start and end points of each stroke are assumed to be marked inside the IR 3 and the PR 5 so that it can be determined which data are the start point or the end point of each stroke. The data stored in the PR 5 in this way are preprocessed using the ALU 4. The preprocessing consists of noise removal and smoothing, but it is not the subject of the present invention.

As described above, the data group input from the tablet 1 undergoes identical-point removal, noise removal, and smoothing, and is stored in the PR 5. FIG. 2 shows the result of writing the numeral "6" on the tablet 1 and connecting the data group stored in the PR 5 with straight lines in the order in which the data occurred.

From the figure in FIG. 2, the following are taken as representative points: (1) the start point, (2) the end point, (3) the points at which the stroke direction changes in x or y (hereinafter extreme points), and (4) the extreme points in x and y when the figure in FIG. 2 is rotated by 45°. Where representative points lie close to one another, the nearby ones are ignored, which yields the figure in FIG. 3. FIG. 3 shows the data group obtained by applying this processing to the data stored in the PR 5, connected with straight lines in order of occurrence; the resulting data group is placed in the PR 5.
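One way to picture the selection of representative points for a single stroke is the sketch below. It is illustrative only and rests on assumptions the text does not state: the names `turning_indices` and `representative_points` and the threshold `min_dist` are hypothetical, the 45° rotation is emulated by the axes `x + y` and `y - x` (scale does not affect where extremes occur), and a candidate is dropped when it lies closer than `min_dist` to the previously kept point.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float]


def turning_indices(values: List[float]) -> Set[int]:
    """Indices where a 1-D sequence switches between rising and falling."""
    turns: Set[int] = set()
    prev = 0  # sign of the last non-zero step
    for i in range(1, len(values)):
        step = values[i] - values[i - 1]
        sign = (step > 0) - (step < 0)
        if sign == 0:
            continue
        if prev != 0 and sign != prev:
            turns.add(i - 1)
        prev = sign
    return turns


def representative_points(stroke: List[Point], min_dist: float = 1.0) -> List[Point]:
    """Start point, end point, and extreme points in x, y and the 45-degree axes."""
    if len(stroke) < 2:
        return list(stroke)
    axes = [
        [x for x, _ in stroke],      # x
        [y for _, y in stroke],      # y
        [x + y for x, y in stroke],  # x after a 45-degree rotation (up to scale)
        [y - x for x, y in stroke],  # y after a 45-degree rotation (up to scale)
    ]
    keep = {0, len(stroke) - 1}
    for axis in axes:
        keep |= turning_indices(axis)
    result = [stroke[0]]
    for i in sorted(keep)[1:-1]:
        px, py = result[-1]
        x, y = stroke[i]
        # Ignore a candidate that is too close to the last kept point.
        if math.hypot(x - px, y - py) >= min_dist:
            result.append(stroke[i])
    result.append(stroke[-1])  # the end point is always kept
    return result
```

For a stroke that rises and then falls, only the start, the peak, and the end survive; raising `min_dist` removes the peak as well.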

When characters such as ANK are viewed as figures, the positions at which their strokes lie are comparatively distinctive. The invention therefore proposes, rather than examining the features of each individual stroke, treating the whole character as a figure and looking at where the strokes are located, as described in detail below. The straight line connecting two representative points in FIG. 3 is hereinafter called a segment. All of the following description assumes data normalized in both the x and y directions.

The first feature is the line length of all the strokes written. With Hx and Hy defined, as shown in FIG. 3, as the largest coordinate values among the representative points in the x and y directions, the following values are used:

$$Q_1 = \frac{1}{H_x}\sum_{i=1}^{n} |x_i - x_{i+1}|$$

$$Q_2 = \frac{1}{H_y}\sum_{i=1}^{n} |y_i - y_{i+1}|$$

That is, Q₁ is the normalized sum of the x components of the line segments obtained by connecting the representative points, and Q₂ is the normalized sum of the y components of the same line segments. Q₁ and Q₂ express the distance the writing instrument moved in the x and y directions as a proportion of the size of the character.
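As a rough numerical illustration of Q₁ and Q₂, the sketch below evaluates both sums for one polyline of representative points. The function name `line_length_features` is hypothetical, and the sketch assumes coordinates measured from the lower-left corner of the circumscribing rectangle with Hx and Hy both non-zero.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def line_length_features(points: List[Point]) -> Tuple[float, float]:
    """Return (Q1, Q2): summed |dx| and |dy| over segments, divided by Hx and Hy."""
    hx = max(x for x, _ in points)  # largest x among the representative points
    hy = max(y for _, y in points)  # largest y among the representative points
    pairs = list(zip(points, points[1:]))  # consecutive points form one segment
    q1 = sum(abs(x1 - x2) for (x1, _), (x2, _) in pairs) / hx
    q2 = sum(abs(y1 - y2) for (_, y1), (_, y2) in pairs) / hy
    return q1, q2


# A stroke that runs right, up, then back left inside a 4 x 2 box.
print(line_length_features([(0, 0), (4, 0), (4, 2), (0, 2)]))  # (2.0, 1.0)
```

Here the pen crosses the full width twice and the full height once, so Q₁ is 2 and Q₂ is 1.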

The second feature is the position at which each segment lies. A segment here means a line segment obtained by connecting representative points in a predetermined order along the character. Where a segment lies in the x and y directions is quantified separately for its x and y components, and an average value is calculated. For the position of the x components along the x axis:

$$Q_3 = \sum_{i=1}^{n}\left\{\frac{|x_i - x_{i+1}|}{H_x}\times\frac{1}{H_x}\cdot\frac{x_i + x_{i+1}}{2}\right\} = \frac{1}{2H_x^2}\sum_{i=1}^{n} |x_i^2 - x_{i+1}^2|$$

Likewise, for the position of the y components along the y axis:

$$Q_4 = \frac{1}{2H_y^2}\sum_{i=1}^{n} |y_i^2 - y_{i+1}^2|$$

For the position of the x components along the y axis:

$$Q_5 = \frac{1}{2H_xH_y}\sum_{i=1}^{n}\left\{|x_i - x_{i+1}|\cdot(y_i + y_{i+1})\right\}$$

and for the position of the y components along the x axis:

$$Q_6 = \frac{1}{2H_xH_y}\sum_{i=1}^{n}\left\{|y_i - y_{i+1}|\cdot(x_i + x_{i+1})\right\}$$

That is, Q₃ indicates whether the horizontal strokes of a character are distributed toward the right or toward the left, and Q₄ indicates whether the vertical strokes are distributed toward the top or toward the bottom. Similarly, Q₅ indicates whether the horizontal strokes are distributed toward the top or toward the bottom, and Q₆ indicates whether the vertical strokes are distributed toward the left or toward the right.
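The four position features can be evaluated with a single pass over the segments, as in the sketch below. The function name `position_features` is hypothetical; the sketch assumes one polyline of representative points, non-negative coordinates measured from the lower-left corner of the circumscribing rectangle, and non-zero Hx and Hy. Under that assumption |x_i − x_{i+1}|·(x_i + x_{i+1}) equals |x_i² − x_{i+1}²|, which is the form used for Q₃ and Q₄.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def position_features(points: List[Point]) -> Tuple[float, float, float, float]:
    """Return (Q3, Q4, Q5, Q6) for a polyline with a lower-left origin."""
    hx = max(x for x, _ in points)
    hy = max(y for _, y in points)
    q3 = q4 = q5 = q6 = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        dx, dy = abs(x1 - x2), abs(y1 - y2)
        q3 += dx * (x1 + x2)  # x component weighted by its x position
        q4 += dy * (y1 + y2)  # y component weighted by its y position
        q5 += dx * (y1 + y2)  # x component weighted by its y position
        q6 += dy * (x1 + x2)  # y component weighted by its x position
    return (
        q3 / (2 * hx * hx),
        q4 / (2 * hy * hy),
        q5 / (2 * hx * hy),
        q6 / (2 * hx * hy),
    )


# Same stroke as a "C" mirrored left to right: bottom edge, right edge, top edge.
print(position_features([(0, 0), (4, 0), (4, 2), (0, 2)]))  # (1.0, 0.5, 1.0, 1.0)
```

In this example the only vertical segment sits on the right edge, so Q₆ reaches 1, while the bottom horizontal segment contributes nothing to Q₅ because it lies on the x axis.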

As concrete examples, for the character "理" the value of Q₁ is large because it has many horizontal strokes, and the value of Q₃ is large because those horizontal strokes are distributed toward the right of the character. For the character "川", by contrast, the only x component is the slight curve of the first stroke, so Q₁ takes a very small value; and because the stroke count itself is low, Q₂ is also comparatively small. For the character "優", the right side of the character is complex, so Q₃ and Q₅ take large values; Q₃ in particular is large because there are many horizontal strokes on the right.

The examples above are all kanji and illustrate only vertical and horizontal strokes. However, because the processing works by obtaining the x and y components of each segment from the representative coordinate values, the curves and diagonal strokes common in hiragana and katakana are handled as well.

As is also clear from FIG. 3, Q₃ to Q₆ are computed in a coordinate system whose origin is the lower-left corner of the rectangle circumscribing the written character. Consequently, the feature value contributed by the x and y components of a segment becomes smaller the lower the segment lies along the y axis and the further left it lies along the x axis. Because the method of the present invention adds up the position-dependent weights of all segments, changes in segments near the origin show up only as small differences, and the result is dominated by changes in segments in the upper-right part of the circumscribing rectangle. Therefore, as shown in FIG. 4, the computations for Q₃ to Q₆ are also carried out in a coordinate system whose origin is the upper-right corner of the rectangle circumscribing the character, and the results are denoted Q₇ to Q₁₀.

Specifically, under the coordinate transformation

$$p_i = x_i - H_x,\qquad q_i = y_i - H_y$$

the features are

$$Q_7 = \frac{1}{2H_x^2}\sum_{i=1}^{n} |p_i^2 - p_{i+1}^2|$$

$$Q_8 = \frac{1}{2H_y^2}\sum_{i=1}^{n} |q_i^2 - q_{i+1}^2|$$

$$Q_9 = \frac{1}{2H_xH_y}\sum_{i=1}^{n}\left\{|p_i - p_{i+1}|\cdot(q_i + q_{i+1})\right\}$$

$$Q_{10} = \frac{1}{2H_xH_y}\sum_{i=1}^{n}\left\{|q_i - q_{i+1}|\cdot(p_i + p_{i+1})\right\}$$

Since Q₇ to Q₁₀ are simply Q₃ to Q₆ after a coordinate transformation, their geometric meanings are the same as those of Q₃ to Q₆. In computing Q₇ to Q₁₀, however, the origin is at the upper right as stated above, so the influence of segment changes in the upper-right part of the character becomes small. Comparing Q₃ to Q₆ with Q₇ to Q₁₀ therefore makes it possible to distinguish a comparatively large change in the segments in the lower-left part of the character from a comparatively small change in the upper-right part, and allows a finer classification than when Q₇ to Q₁₀ are not used.
For a character with multiple strokes, the above computations are performed over all segments of all strokes.
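The upper-right-origin features can be sketched in the same way, this time summing over every segment of every stroke. The function name `upper_right_features` and the list-of-strokes input format are assumptions made for illustration. The sketch follows the formulas as written; because p_i and q_i are never positive, Q₉ and Q₁₀ computed this way are never positive either.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def upper_right_features(strokes: List[List[Point]]) -> Tuple[float, float, float, float]:
    """Return (Q7, Q8, Q9, Q10) over all segments of all strokes."""
    hx = max(x for stroke in strokes for x, _ in stroke)
    hy = max(y for stroke in strokes for _, y in stroke)
    q7 = q8 = q9 = q10 = 0.0
    for stroke in strokes:
        # Shift the origin to the upper-right corner: p = x - Hx, q = y - Hy.
        shifted = [(x - hx, y - hy) for x, y in stroke]
        for (p1, q1), (p2, q2) in zip(shifted, shifted[1:]):
            q7 += abs(p1 * p1 - p2 * p2)
            q8 += abs(q1 * q1 - q2 * q2)
            q9 += abs(p1 - p2) * (q1 + q2)   # non-positive, since q <= 0
            q10 += abs(q1 - q2) * (p1 + p2)  # non-positive, since p <= 0
    return (
        q7 / (2 * hx * hx),
        q8 / (2 * hy * hy),
        q9 / (2 * hx * hy),
        q10 / (2 * hx * hy),
    )
```

Segments are never formed across the gap between two strokes, so a pen-up movement contributes nothing to any feature.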

Using the formulas for Q₁ to Q₁₀ described above, the data in the PR 5 are processed by the ALU 4 and the results are stored in the AR 6. The RR 7 holds, for each character, the maximum and minimum values of Q₁ to Q₁₀, covering the character's various deformed variants. The features Q₁ to Q₁₀ of the input character stored in the AR 6 are compared by the COMP 8 with the features of each character in the RR 7. Because the RR 7 holds minimum and maximum values, the COMP 8 checks, for every one of Q₁ to Q₁₀, whether the content of the AR 6 falls within the range from the minimum to the maximum stored in the RR 7. If all of Q₁ to Q₁₀ in the AR 6 fall within the ranges given by the RR 7, the character to which those RR 7 values belong is sent to the SR 9 and stored, so that one candidate character has been stored in the SR 9. When the COMP 8 has finished checking in this way whether the input falls within the ranges of every one of the roughly 100 ANK characters, the characters stored in the SR 9 form the coarsely classified character group.
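In software terms, the comparison performed by the COMP 8 amounts to an interval test per feature. The sketch below is an illustration under assumptions: the name `coarse_classify` and the dictionary layout standing in for the RR 7 are hypothetical, and the example uses two features per character instead of ten to stay short.

```python
from typing import Dict, List, Sequence, Tuple

Range = Tuple[float, float]  # (minimum, maximum) for one feature


def coarse_classify(
    features: Sequence[float],
    reference: Dict[str, Sequence[Range]],
) -> List[str]:
    """Return every character whose stored ranges contain all input features."""
    candidates: List[str] = []
    for char, ranges in reference.items():
        if all(lo <= f <= hi for f, (lo, hi) in zip(features, ranges)):
            candidates.append(char)  # corresponds to storing the character in SR
    return candidates


# Hypothetical reference data with two features per character.
reference = {
    "6": [(0.5, 1.5), (1.0, 2.0)],
    "0": [(1.4, 2.5), (1.0, 2.0)],
}
print(coarse_classify([1.45, 1.5], reference))  # ['6', '0']
print(coarse_classify([1.0, 1.5], reference))   # ['6']
```

Overlapping ranges are expected: the coarse stage only narrows the set of candidates, and a later stage decides among them.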

After this, each character can be recognized by a known method, for example the stroke analysis method.

As the number of characters increases, some characters cannot be clearly distinguished even in this way. In addition, because written characters differ from the standard shape from writer to writer and from occasion to occasion, the feature values to be compared are given a tolerance range, which further increases the number of characters that cannot be distinguished. Nevertheless, the simple computations described above reduce the number of candidate characters substantially, so recognition takes far less processing time than when every character is recognized by the conventional method.
This reduces the dependence on hardware, or removes the need to restrict the number of strokes and the stroke order.

The numeral 8 is a familiar example that makes this easy to understand. If arrows are used to show stroke direction, the numeral 8 is written in a variety of ways (the illustrative glyphs of the original are not reproduced here), but viewed as figures they are all similar, and the values of Q₁ to Q₁₀ that express the distribution of the stroke line segments fall within a certain range. The stroke analysis method has to treat each way of writing separately, whereas the method of the present invention can identify the character as the numeral 8 from a single description, whichever way it was written.

As described in detail above, the present invention is a coarse classification method that recognizes a written character by treating it as a figure and identifying how the stroke line segments are distributed within it. Compared with the stroke analysis method, it therefore needs less hardware and less processing time. Furthermore, the result does not depend on the number of strokes: "7", for example, may be written with one stroke or with two. For multi-stroke characters too, the values of Q₁ to Q₁₀ are obtained from all segments of all strokes and then compared, so the stroke order is not constrained. In short, the invention provides a coarse classification method for online character recognition that is unaffected by stroke count and stroke order.

In the embodiment above, the COMP 8 sends to the SR 9, as candidates, the characters for which the data in the AR 6 fall within the data ranges in the RR 7. As described, however, a character is excluded from the candidates if even one of Q₁ to Q₁₀ in the AR 6 falls outside the range from the minimum to the maximum given in the RR 7. To prevent this, a separate flip-flop circuit (hereinafter FF) can be provided. The FF is set the first time one of Q₁ to Q₁₀ falls outside the RR 7 range, and if no further value falls outside its range, the FF information is stored in the SR 9 together with the candidate character code. This reduces the occurrence of the state in which coarse classification yields no candidate. That state arises only when a character is written in a highly unusual way, but even then the character is not rejected at the coarse classification stage.
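The flip-flop variant can be read as tolerating exactly one out-of-range feature per character and recording that fact alongside the candidate. The sketch below follows that reading; the name `classify_with_flag`, the boolean standing in for the FF state, and the dictionary standing in for the RR 7 are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Range = Tuple[float, float]  # (minimum, maximum) for one feature


def classify_with_flag(
    features: Sequence[float],
    reference: Dict[str, Sequence[Range]],
) -> List[Tuple[str, bool]]:
    """Return (character, flag) pairs; flag is True when one feature was out of range."""
    candidates: List[Tuple[str, bool]] = []
    for char, ranges in reference.items():
        misses = sum(
            1 for f, (lo, hi) in zip(features, ranges) if not lo <= f <= hi
        )
        if misses <= 1:  # the first miss sets the flag; a second one rejects
            candidates.append((char, misses == 1))
    return candidates
```

Keeping the flag with the candidate lets a later recognition stage treat flagged candidates as less certain than exact matches.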

Another method that likewise reduces rejections in coarse classification is also effective: if checking the input character data in the AR 6 against the data in the RR 7 with the COMP 8 yields no candidate character, the ranges held in the RR 7 are widened by about 1/8 to 1/4 and the check is repeated.
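The retry with widened ranges might look like the sketch below. The text does not say how the widening is applied, so the sketch assumes that each range grows by the given fraction of its own width, split evenly between the two ends. The names `widen` and `classify_with_retry` and the dictionary standing in for the RR 7 are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Range = Tuple[float, float]  # (minimum, maximum) for one feature


def widen(ranges: Sequence[Range], fraction: float) -> List[Range]:
    """Grow each range by `fraction` of its width, half at each end (assumed rule)."""
    out: List[Range] = []
    for lo, hi in ranges:
        margin = (hi - lo) * fraction / 2
        out.append((lo - margin, hi + margin))
    return out


def match(features: Sequence[float], reference: Dict[str, Sequence[Range]]) -> List[str]:
    """Characters whose ranges contain every feature."""
    return [
        char
        for char, ranges in reference.items()
        if all(lo <= f <= hi for f, (lo, hi) in zip(features, ranges))
    ]


def classify_with_retry(
    features: Sequence[float],
    reference: Dict[str, Sequence[Range]],
    fraction: float = 0.25,
) -> List[str]:
    """Check once; if nothing matches, widen every range and check again."""
    candidates = match(features, reference)
    if candidates:
        return candidates
    widened = {char: widen(ranges, fraction) for char, ranges in reference.items()}
    return match(features, widened)
```

With `fraction=0.25`, a stored range of 1.0 to 2.0 becomes 0.875 to 2.125 on the second pass, so an input of 2.1 that was rejected at first is accepted on the retry.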

(Effects of the Invention) As is clear from the above description, the present invention treats a character as a figure and recognizes it using the distribution of its stroke line segments as features, so coarse classification for online character recognition can be performed with very simple hardware. Moreover, those skilled in the art will readily understand that the invention is not limited to about 100 characters such as ANK, but can also be applied to hiragana and to the recognition of partial patterns of kanji.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention; FIG. 2 is an example of data after smoothing; FIG. 3 is an example of data obtained by extracting representative points from FIG. 2; and FIG. 4 shows the origin placed at the upper-right corner of the rectangle circumscribing the character.

1: tablet, 2: analog-to-digital converter, 3: input register, 4: arithmetic unit including arithmetic registers, 5: partial value register, 6: set-value register, 7: reference register, 8: comparator, 9: candidate register.

Claims

[Claims]

1. A coarse classification method for character recognition in an online character recognition device that has a tablet generating data indicating the coordinates of a writing instrument as a character is written and that recognizes the written character from the data generated by the tablet, the device comprising: means for converting analog data from the tablet into digital data; means for temporarily storing the digital data; means for, after performing preprocessing to determine representative points, calculating the sum of the lengths of the line segments obtained by connecting the representative points in a predetermined order along the character and the average value of the distance between each point on the line segments and a coordinate axis, and for normalizing the sum of the lengths of the line segments and the average value of the distances; and means for comparing the normalized values with predetermined values; the method being characterized in that the written character is classified according to the length and distribution state of the line segments.