JPH0799534B2 - Character figure recognition method - Google Patents

Character figure recognition method

Info

Publication number
JPH0799534B2
Authority
JP
Japan
Prior art keywords
character
series
pattern
divided
coordinate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP62061241A
Other languages
Japanese (ja)
Other versions
JPS63228390A (en)
Inventor
敏行 射手園
浩一 樋口
晃治 伊東
義征 山下
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oki Electric Industry Co Ltd
Original Assignee
Oki Electric Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oki Electric Industry Co Ltd
Priority to JP62061241A
Publication of JPS63228390A
Publication of JPH0799534B2
Anticipated expiration
Expired - Fee Related

Landscapes

  • Character Discrimination (AREA)

Description

[Detailed Description of the Invention] (Industrial Field of Application) The present invention relates to a character/figure recognition method for recognizing characters and figures on a medium.

(Prior Art) Conventional character/figure recognition apparatuses have commonly extracted strokes from the character/figure pattern and recognized the pattern from the positions and lengths of the extracted strokes, the relationships between strokes, and so on. Strokes are extracted in one of two ways: (1) the contour of the character/figure is traced, the curvature is computed along the resulting contour point series, the contour series is split at points of large curvature, and the split segments are combined into strokes; or (2) the character/figure pattern is thinned into a skeleton, and the skeleton pattern is traced to examine its connectivity and to detect points of abrupt angle change, from which strokes are extracted. Geometric features and the like are then extracted from the strokes obtained by (1) or (2) and used for classification.

(Problems to be Solved by the Invention) However, the conventional character/figure recognition methods described above have the following problems.

With method (1), the amount of processing grows as the character/figure pattern becomes larger or more complex, and processing speed falls. With method (2), the character/figure pattern must be thinned, and thinning introduces pattern distortion, spurs (whisker-like branches), and similar artifacts that complicate subsequent processing.

An object of the present invention is to solve the problems described above and to provide a character/figure recognition method that can recognize characters and figures at high speed with simple processing.

(Means for Solving the Problems) To solve the above problems, the present invention provides a character/figure recognition method that includes storage means for storing a pattern obtained by reading a character/figure on a medium and quantizing and binarizing it (for example, representing character-line portions as black bits and the background as white bits), and that recognizes the character/figure on the basis of the pattern. The method comprises:
(a) first detection means for scanning the pattern to detect the circumscribing frame of the character/figure;
(b) creation means for scanning the pattern to create the black-bit count distribution along each axis, obtained by projecting the pattern onto two predetermined axes (for example, the horizontal and vertical directions of a two-dimensional coordinate system, hereinafter called the X axis and the Y axis);
(c) second detection means for detecting a centroid coordinate series along each axis by determining the centroid coordinate of each black-bit count distribution over the range inside the circumscribing frame, and then repeating the process of determining the centroid coordinate of each black-bit count distribution over each sub-range obtained by dividing the range inside the circumscribing frame at the centroid coordinates already determined;
(d) determination means for determining a division coordinate series along each axis on the basis of the centroid coordinate series;
(e) calculation means for calculating a divided-region length series along each axis, in which the length of each region of the circumscribing frame divided by the division coordinate series is normalized by the length of the circumscribing frame along the corresponding axis; and
(f) recognition means for recognizing the character/figure of the pattern by matching the divided-region length series against divided-region length series computed in advance for standard patterns.

(Operation) With the character recognition method configured as above, the technical means operate as follows. By scanning the pattern stored in the storage means, the first detection means detects the circumscribing frame (character frame) of the character/figure, and the creation means creates the black-bit count distribution along each axis (for example, the X and Y axes). From the circumscribing frame and the black-bit count distributions obtained in this way, the second detection means detects the centroid coordinate series along each axis. From the detected centroid coordinate series, the determination means determines the division coordinate series along each axis; for example, the number of divisions along each axis is set according to the complexity of the character/figure, and division coordinates are chosen that correspond approximately evenly to the centroid coordinate series. The calculation means then calculates the divided-region length series along each axis, in which the length of each region of the circumscribing frame divided by the division coordinate series is normalized by the length of the circumscribing frame along the corresponding axis. The recognition means matches this normalized divided-region length series against the divided-region length series computed in advance in the same way for standard patterns, and outputs the category name of the matching standard pattern as the name of the character/figure. Because the invention scans the character/figure pattern, computes the divided-region length series as its feature information, and recognizes the character/figure from that feature information, fast character recognition is possible with simple processing.

(Embodiment) An embodiment of the present invention is described below with reference to FIGS. 1 to 4.

FIG. 1 is a functional block diagram showing an embodiment of the character/figure recognition method of the present invention.

Reference numeral 1 denotes optical input from a medium, such as a form, on which characters, figures, symbols, and the like (hereinafter, characters) are written. The optical input 1 is fed to the photoelectric conversion unit 2. The photoelectric conversion unit 2 decomposes one expected character region into 128 × 128 pixels and converts each pixel into a binary digital signal (hereinafter, the input character pattern); a character of average size is represented by an input character pattern of about 60 × 60 bits. The pattern register 3 stores the input character pattern in a form that preserves the X and Y coordinates of each pixel in the expected character region, and has a capacity of 128 × 128 bits to match that region.

The character frame detection unit 4 detects the circumscribing frame (character frame) of the character, expressing it, for example, by its left-end coordinate Xl, right-end coordinate Xr, top-end coordinate Yt, and bottom-end coordinate Yb in the pattern register.
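The frame detection described above can be sketched in Python as follows. This is only an illustrative sketch: the function name `bounding_box` and the list-of-rows representation of the pattern register are assumptions, not part of the patent text.

```python
def bounding_box(pattern):
    """Return (x_l, x_r, y_t, y_b) for the black bits, or None if there are none.

    pattern is a list of rows (hypothetical layout); pattern[y][x] is 1 for a
    black bit and 0 for a white bit.
    """
    # Rows that contain at least one black bit.
    rows = [y for y, row in enumerate(pattern) if any(row)]
    if not rows:
        return None
    width = len(pattern[0])
    # Columns that contain at least one black bit.
    cols = [x for x in range(width) if any(row[x] for row in pattern)]
    return cols[0], cols[-1], rows[0], rows[-1]
```

The returned tuple plays the role of Xl, Xr, Yt, Yb in the text.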

The character projection creation unit 5 projects the input character pattern in the pattern register 3 onto predetermined axes, for example the X axis and the Y axis (the horizontal and vertical directions, respectively, of the two-dimensional coordinates of the pattern register 3), obtains the distribution of the number of black bits, and creates the black-bit count distributions SX(x) and SY(y).

Here, x and y are two-dimensional coordinates in the pattern register 3, each ranging from 0 to 127; Yt and Yb are the top-end and bottom-end coordinates of the character frame along the Y axis; Xl and Xr are the left-end and right-end coordinates along the X axis; and P(x, y) denotes a black bit or a white bit, taking P(x, y) = 1 for a black bit (the significant color) and P(x, y) = 0 for a white bit (the background color).
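A minimal sketch of the projection step, assuming that SX(x) is the sum of P(x, y) over y = Yt to Yb and SY(y) is the sum of P(x, y) over x = Xl to Xr. That reading is inferred from the definitions above, since the equations themselves do not appear in this text. The function name and data layout are hypothetical.

```python
def projections(pattern, box):
    """Return (sx, sy), the black-bit counts per column and per row.

    pattern[y][x] is 1 for black and 0 for white; box is (x_l, x_r, y_t, y_b).
    Only bits inside the character frame are counted (assumed reading).
    """
    x_l, x_r, y_t, y_b = box
    sx = [0] * len(pattern[0])  # SX(x): black bits in column x
    sy = [0] * len(pattern)     # SY(y): black bits in row y
    for y in range(y_t, y_b + 1):
        for x in range(x_l, x_r + 1):
            if pattern[y][x]:
                sx[x] += 1
                sy[y] += 1
    return sx, sy
```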

FIG. 2(a) shows the kanji 天 and 夫 as example input character patterns, and FIGS. 2(b) and 2(c) show the black-bit count distributions SX(x) and SY(y) for each pattern in FIG. 2(a).

The centroid detection unit 6 obtains the centroid coordinate series X(Mp), Y(Mq) of the black-bit count distributions SX(x), SY(y) of the input character pattern. Its targets are the full ranges Xl to Xr and Yt to Yb of the character frame along the X and Y axes, and the sub-ranges obtained by dividing those ranges at the centroid coordinates detected in earlier passes. Each centroid is obtained by dividing the sum of the first moments over the range by the sum of the black bits over that range. Here Mp and Mq are centroid numbers assigned in ascending order of coordinate value, with Mp = 1 to MX (MX is the number of centroids along the X axis and is odd) and Mq = 1 to MY (MY is the number of centroids along the Y axis). It is desirable for the number MX of centroid coordinates along the X axis to be relatively large compared with the number of divisions, about 15, but for simplicity the following describes the case in which seven centroid coordinates X(Mp) are detected.

First, for the range Xl to Xr of the character frame along the X axis, the centroid coordinate X(4), which has the central centroid number 4, is obtained by dividing the first-moment sum of the black-bit count distribution SX(x) of the input character pattern by the black-bit sum over that range; that is, X(4) = (Σ x·SX(x)) / (Σ SX(x)), with both sums taken over x = Xl to Xr. Next, for each of the two ranges Xl to X(4) and X(4) to Xr divided at X(4), the two centroid coordinates X(2) and X(6) are obtained.

Next, for the ranges Xl to X(2), X(2) to X(4), X(4) to X(6), and X(6) to Xr divided at the centroid coordinates X(2), X(4), X(6) detected so far, the four centroid coordinates X(1), X(3), X(5), X(7) are obtained.

The centroid coordinates Y(Mq) along the Y axis are detected in the same way. When the number MY of centroid coordinates is seven, the centroid coordinate Y(4) of the black-bit distribution SY(y) of the input character pattern is first detected over the character frame range Yt to Yb. Next, the centroid coordinates Y(2) and Y(6) of SY(y) are detected over the two ranges Yt to Y(4) and Y(4) to Yb into which Y(4) divides the character frame. Finally, the centroid coordinates of SY(y) are detected over each of the ranges Yt to Y(2), Y(2) to Y(4), Y(4) to Y(6), and Y(6) to Yb into which the centroid coordinates Y(2), Y(4), Y(6) detected so far divide the character frame along the Y axis. In total, seven centroid coordinates Y(1) to Y(7) are detected.
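The repeated centroid detection can be sketched as a short recursion. This sketch rests on assumptions the text does not settle: the centroid is truncated to an integer, an empty range falls back to its midpoint, and adjacent sub-ranges share the dividing coordinate (as the ranges Xl to X(4) and X(4) to Xr suggest). The function names are hypothetical.

```python
def centroid(hist, lo, hi):
    """Centroid of hist over lo..hi inclusive: first-moment sum / black-bit sum."""
    mass = sum(hist[lo:hi + 1])
    if mass == 0:
        # Assumption: the text does not cover empty ranges; use the midpoint.
        return (lo + hi) // 2
    moment = sum(i * hist[i] for i in range(lo, hi + 1))
    return moment // mass  # Assumption: truncate to an integer coordinate.


def centroid_series(hist, lo, hi, levels=3):
    """Return 2**levels - 1 centroids in ascending order (7 when levels=3)."""
    if levels == 0:
        return []
    c = centroid(hist, lo, hi)
    left = centroid_series(hist, lo, c, levels - 1)   # range lo..c
    right = centroid_series(hist, c, hi, levels - 1)  # range c..hi
    return left + [c] + right
```

With `levels=3`, index 3 of the returned list corresponds to X(4), indices 1 and 5 to X(2) and X(6), and the rest to X(1), X(3), X(5), X(7). Calling the same function on SY(y) with Yt and Yb yields Y(1) to Y(7).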

For the input character patterns of the kanji 天 and 夫 (FIG. 2(a)), the centroid coordinates X(1) to X(7) and Y(1) to Y(7) are marked on the black-bit count distributions SX(x), SY(y) in FIGS. 2(b) and 2(c).

The character frame division point determination unit 7 determines the division coordinate series DX(ki), DY(kj) along the X and Y axes, where NX and NY are the numbers of divisions along the X and Y axes. It treats the centroid coordinate series X(Mp), Y(Mq) received from the centroid detection unit 6 as candidate division coordinates and maps the centroid numbers Mp, Mq approximately evenly onto the division coordinate numbers ki, kj.

In this embodiment the number of divisions along the X axis can take four values, NX = 4, 5, 6, or 8, and likewise the number of divisions along the Y axis can take four values, NY = 4, 5, 6, or 8. With ki (ki = 1 to NX−1) as the division coordinate number along the X axis and kj (kj = 1 to NY−1) as the division coordinate number along the Y axis, the division coordinate series DX(ki), DY(kj) that divide the character frame into NX and NY regions, respectively, are determined. Table 1 shows the table used to map the centroid numbers Mp, Mq approximately evenly onto the division coordinate numbers ki, kj when determining the division coordinate series DX(ki), DY(kj).

The centroid numbers Mp, Mq corresponding to the numbers of divisions NX, NY along the X and Y axes are read from this table, and the centroid coordinates X(Mp), Y(Mq) with those numbers are adopted as the division coordinates DX(ki), DY(kj).

The table in Table 1 covers the case in which the numbers of centroid coordinates MX, MY detected by the centroid detection unit 6 are both seven. In the general case, a table can be built by mapping the centroid coordinates so that the divided regions along each of the X and Y axes contain centroid coordinates according to the number of divisions and, if surplus centroid coordinates remain, so that the regions receive one additional centroid coordinate each, starting from the regions at both ends.

FIG. 3 shows the correspondence between the division coordinate series DX(ki), DY(kj) and the centroid coordinate series X(Mp), Y(Mq) when NX = NY = 5 is specified as the number of divisions along the X and Y axes.

The numbers of divisions NX, NY are determined according to the complexity of the input character; alternatively, when a character has been rejected, NX and NY are changed and recognition is attempted again.

As described above, the character frame division point determination unit 7 can use any of four numbers of divisions along the X axis, NX = 4, 5, 6, 8, and any of four along the Y axis, NY = 4, 5, 6, 8. For simplicity, the following description assumes NX = NY = 4. In this case, the division coordinates DX(1), DX(2), DX(3) along the X axis are set to the centroid coordinates X(2), X(4), X(6), and the division coordinates DY(1), DY(2), DY(3) along the Y axis are set to the centroid coordinates Y(2), Y(4), Y(6).
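For the NX = NY = 4 case just described, the selection reduces to picking centroids 2, 4, and 6. The sketch below covers only that case; the function name is hypothetical, and the frame ends are included as DX(0) and DX(4), following the convention DX(0) = Xl, DX(4) = Xr used by the normalization step. The other division counts depend on Table 1, which is not reproduced in this text, so they are not attempted here.

```python
def division_coordinates_nx4(centroids, lo, hi):
    """Return [DX(0), DX(1), DX(2), DX(3), DX(4)] for four divisions.

    centroids holds the seven centroid coordinates X(1)..X(7) in ascending
    order (index 0 is X(1)); lo and hi are the frame ends (Xl and Xr).
    """
    if len(centroids) != 7:
        raise ValueError("this sketch handles exactly seven centroids")
    # DX(1), DX(2), DX(3) take the centroids numbered 2, 4, and 6.
    picked = [centroids[m - 1] for m in (2, 4, 6)]
    return [lo] + picked + [hi]
```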

The normalized divided-region length calculation unit 8 receives the character frame coordinates Xl, Xr along the X axis and Yt, Yb along the Y axis detected by the character frame detection unit 4, together with the division coordinates DX(1), DX(2), DX(3) along the X axis and DY(1), DY(2), DY(3) along the Y axis determined by the character frame division point determination unit 7. Using the following equations, it calculates the normalized divided-region length series, in which the length of each region divided by the division coordinates on each axis is normalized by the length between the end coordinates on that axis.

X-axis normalized division series: [equation not preserved]
Length between the X-axis end coordinates: LX = Xr − Xl + 1 ... (7)
Y-axis normalized division series: [equation not preserved]
Length between the Y-axis end coordinates: LY = Yb − Yt + 1 ... (9)
where DX(0) = Xl, DX(4) = Xr, DY(0) = Yt, and DY(4) = Yb.
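A hedged sketch of the normalization: the per-region equations are not preserved in this text, so the code assumes the simplest reading, BEX(i) = (DX(i) − DX(i−1)) / LX with LX = Xr − Xl + 1 as in equation (7). The patent's exact expression may differ, for example in how the end pixels are counted. The function name is hypothetical.

```python
def normalized_lengths(div):
    """Return the normalized region lengths for division coordinates div.

    div is [D(0), D(1), ..., D(N)], with D(0) and D(N) the frame ends.
    Assumed formula: (D(i) - D(i-1)) / L, where L = D(N) - D(0) + 1.
    """
    length = div[-1] - div[0] + 1  # LX or LY, as in equations (7) and (9)
    return [(div[i] - div[i - 1]) / length for i in range(1, len(div))]
```

Because each length is divided by the frame length, the features do not depend on the absolute size of the character, which is what lets the same dictionary entry match characters written at different sizes.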

FIG. 2(a) shows, together with each input character pattern, the division coordinates DX(0) to DX(4) and DY(0) to DY(4), the normalized divided-region lengths BEX(1) to BEX(4) along the X axis, and the normalized divided-region lengths BEY(1) to BEY(4) along the Y axis for the kanji 天 and 夫.

The normalized divided-region length series fi = {BEX(I), BEY(I) | I = 1 to 4}, obtained by the normalized divided-region length calculation unit 8 as the feature information of the input character pattern, is passed to the identification unit 9.

The dictionary memory 10 holds, registered in advance, the normalized divided-region length series gi computed for each standard pattern in the same way as for the input character pattern and serving as that pattern's feature information.

The identification unit 9 measures the similarity between the feature information of the input character pattern and that of each standard pattern, recognizes the character code of the most similar standard pattern as the name of the input character pattern, and outputs that character code to the character code output terminal 11. In this embodiment, the most similar standard pattern is the one that minimizes the Euclidean distance D between the normalized divided-region length series gi of the standard pattern in the dictionary memory 10 and the normalized divided-region length series fi of the input character pattern.
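The matching step can be sketched as a nearest-neighbor search. The distance equation does not appear in this text, so the code uses the standard Euclidean distance, the square root of the sum of squared differences. The dictionary layout (character code mapped to feature list) and the function names are assumptions made for illustration.

```python
import math


def euclidean_distance(f, g):
    """Standard Euclidean distance between two equal-length feature lists."""
    if len(f) != len(g):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))


def recognize(feature, dictionary):
    """Return the code whose registered features are closest to feature.

    dictionary maps a character code to its feature list (hypothetical layout).
    """
    return min(dictionary, key=lambda code: euclidean_distance(feature, dictionary[code]))
```

Here `feature` would be the concatenation of BEX(1) to BEX(4) and BEY(1) to BEY(4) for the input pattern, and each dictionary entry the same eight values for a standard pattern.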

In the above embodiment a table was used to map centroid coordinates to division coordinates, but the mapping can also be obtained by executing the arithmetic procedure of a flowchart. FIG. 4 shows the flowchart for this case. All divisions in FIG. 4 truncate the fractional part.

In FIG. 4, step S1 obtains Mα, the number of centroids MX divided by the number of divisions NX. Steps S2 and S3 obtain the remainder R1 of MX/NX and the quotient R2 of R1 divided by 2. Step S4 obtains the median kα of the number of divisions, and steps S5 and S6 set the division number ki and the centroid number Mp to 0. In steps S7, S8, and S9, each time the division number ki is incremented by 1, the previously set quotient R2 is decremented by 1 and the centroid number Mp is increased by Mα. Step S10 checks whether the quotient R2 is non-negative. As long as R2 is non-negative, step S11 increments the centroid number Mp by 1, and step S12 associates that centroid number Mp with the division number ki and determines the division coordinate DX(ki). When R2 is negative, step S13 determines whether the current division number ki is greater than the median kα. If it is greater, the centroid number Mp incremented by 1 is associated with the division number ki in step S12; if it is smaller, the centroid number Mp set in step S9 is associated with it; in either case the division coordinate DX(ki) is determined. Step S14 detects that the division number ki has reached (NX−1), and the procedure ends.

The effectiveness of the divided-region length series, which is the feature information used by the character/figure recognition method of this embodiment, is explained below.

For example, in the input character patterns 天 and 夫 shown in FIG. 2(a), the difference between the two patterns, namely whether the central vertical stroke protrudes at the top, appears as a marked difference in BEY(1). The normalized divided-region length series therefore clearly reflects differences between character/figure patterns.

As described above, this embodiment uses as feature information the normalized divided-region length series obtained by scanning the input character pattern and performing predetermined calculations, so fast character/figure recognition can be achieved with simple processing.

(Effects of the Invention) As described in detail above, the present invention does not perform the complex pattern processing, such as contour tracing and thinning, used for feature extraction in conventional recognition methods. Instead, it uses centroids to derive the normalized divided-region length series, which serves as the feature information, from the black-bit count distributions on predetermined axes, which are obtained simply by scanning the input character/figure pattern. Fast character recognition can therefore be achieved with simple processing.

[Brief Description of the Drawings]

FIG. 1 is a functional block diagram showing an embodiment of the character/figure recognition method according to the present invention. FIG. 2 shows example input character patterns and their relationship to the centroid coordinate series, division coordinate series, and normalized divided-region length series. FIG. 3 shows the correspondence between the centroid coordinate series and the division coordinate series. FIG. 4 is a flowchart showing another way of determining the division coordinate series.
1: optical input; 2: photoelectric conversion unit; 3: pattern register; 4: character frame detection unit; 5: character projection creation unit; 6: centroid detection unit; 7: character frame division point determination unit; 8: normalized divided-region length calculation unit; 9: identification unit; 10: dictionary memory; 11: output terminal

Continuation of front page
(72) Inventor: Yoshiyuki Yamashita (山下 義征), c/o Oki Electric Industry Co., Ltd., 1-7-12 Toranomon, Minato-ku, Tokyo
(56) References: JP S58-123171 (JP, A); JP S61-150086 (JP, A)

Claims (1)

[Claims]
1. A character/figure recognition method that includes storage means for storing a pattern obtained by reading and quantizing a character/figure on a medium, and that recognizes the character/figure on the basis of the pattern, the method being characterized by comprising:
(a) first detection means for scanning the pattern to detect the circumscribing frame of the character/figure;
(b) creation means for scanning the pattern to create the black-bit count distribution along each axis, obtained by projection onto two predetermined axes;
(c) second detection means for detecting a centroid coordinate series along each axis by determining the centroid coordinate of each black-bit count distribution over the range inside the circumscribing frame along the two axes, and repeating the process of determining the centroid coordinate of each black-bit count distribution over each sub-range obtained by dividing the range inside the circumscribing frame at the centroid coordinates already determined;
(d) determination means for determining a division coordinate series along each axis on the basis of the centroid coordinate series;
(e) calculation means for calculating a divided-region length series along each axis, in which the length of each region of the circumscribing frame divided by the division coordinate series is normalized by the length of the circumscribing frame along the corresponding axis; and
(f) recognition means for recognizing the character/figure of the pattern by matching the divided-region length series against divided-region length series computed in advance for standard patterns.
JP62061241A 1987-03-18 1987-03-18 Character figure recognition method Expired - Fee Related JPH0799534B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP62061241A JPH0799534B2 (en) 1987-03-18 1987-03-18 Character figure recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP62061241A JPH0799534B2 (en) 1987-03-18 1987-03-18 Character figure recognition method

Publications (2)

Publication Number Publication Date
JPS63228390A (en) 1988-09-22
JPH0799534B2 (en) 1995-10-25

Family

ID=13165536

Family Applications (1)

Application Number Title Priority Date Filing Date
JP62061241A Expired - Fee Related JPH0799534B2 (en) 1987-03-18 1987-03-18 Character figure recognition method

Country Status (1)

Country Link
JP (1) JPH0799534B2 (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS58123171A (en) * 1982-01-18 1983-07-22 Oki Electric Ind Co Ltd Character recognizing system

Also Published As

Publication number Publication date
JPS63228390A (en) 1988-09-22

Similar Documents

Publication Publication Date Title
CN111275049A (en) Method and device for acquiring character image skeleton feature descriptors
CN113537216B (en) Dot matrix font text line inclination correction method and device
JPH0799534B2 (en) Character figure recognition method
JPH0656625B2 (en) Feature extraction method
JPH0799535B2 (en) Character figure recognition method
JPH0799536B2 (en) Character figure recognition method
JPS6214277A (en) Image processing method
JP3095470B2 (en) Character recognition device
JP2785747B2 (en) Character reader
JP2616994B2 (en) Feature extraction device
JPH0656624B2 (en) Feature extraction method
JPH0664629B2 (en) Character recognition method
JP2576491B2 (en) Feature extraction method
JPH0877293A (en) Character recognition apparatus and method for creating dictionary for character recognition
JP2576494B2 (en) Feature extraction method
JPH0147835B2 (en)
JPH0147829B2 (en)
JPH09114990A (en) Image recognition method
JPH0646418B2 (en) Feature extraction method
JPH01187684A (en) Character recognizing device
JPH0580711B2 (en)
JPH0632080B2 (en) Character recognition method
JPH0438024B2 (en)
JPS5837780A (en) Character recognizing method
JPH0226267B2 (en)

Legal Events

Date Code Title Description
LAPS Cancellation because of no payment of annual fees