JPS5851309B2

JPS5851309B2 - Character selection processing method

Info

Publication number: JPS5851309B2
Application number: JP54049550A
Authority: JP
Inventors: 康宏山田; 和昭小森; 俊吉多田; 敏夫堤田; 行恭飯田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 1979-04-21
Filing date: 1979-04-21
Publication date: 1983-11-15
Also published as: JPS55140976A

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a character selection processing method, and more particularly to a character selection processing method performed by the post-processing unit of a character recognition device that recognizes text in which multiple character type groups are mixed.

In conventional character recognition devices that read multiple character type groups, the usual approach was to prepare independent recognition logic for each character type group, switch between the logics under format control, and divide the form into fields according to character type.

Specifically, there are methods such as (i) the one shown in Japanese Patent Application Laid-open No. 52-100936, in which format information such as the number of characters and the character type of each field on the form is stored in the device in advance, and during reading the recognition logic is switched while confirming the field position to which each character belongs; and (ii) the one shown in Japanese Patent Application Laid-open No. 53-18345, in which a specific symbol defined for each character type group is written at the head of each field on the form, and during reading that symbol is read first, the recognition logic is switched according to it, and everything up to the next such symbol is regarded as the same field.

However, in all of these methods the character type to be written in each entry field is fixed in advance, so they could not be used as they are for address fields in which kana characters and numerals are mixed, or for product name fields in which kana characters, alphabetic characters, numerals, and so on are mixed.

As a result, they had the drawback of increasing the burden on the writer, for example by requiring part of an address or product name to be coded and entered as numerals.

Separately from these, there is also (iii) a method that reads by merging multiple character type groups into one group and handling it in the same way as a single character type group.

This method reads a mixed field (for example, one field containing alphabetic characters, numerals, and kana characters) without using any information other than the shapes of the individual characters. However, to separate characters that closely resemble each other (for example, "7" and "り", or "5" and "S"), writers must be trained to emphasize the differences when writing, which increases their psychological burden, and the scale of the recognition logic becomes significantly larger.

There is also a method, as seen in (iv) Japanese Patent Application No. 52-86025, that applies post-processing: the sequential relationship of the read characters is examined, and if a grammatically impermissible combination is found, one of the characters is assumed to have been misread and one or both characters are rejected. However, this is effective only for extremely limited categories such as the dakuten (voiced sound mark) and handakuten (semi-voiced sound mark), and lacks generality.

The present invention eliminates these drawbacks. Its object is to provide a character selection processing method for performing context processing that has generality, and thereby to realize a character recognition device that reads text in which alphabetic characters, numerals, kana characters, and so on are mixed, without increasing the psychological burden on the writer.

Another object of the present invention is to achieve a high recognition rate, in a character recognition device that performs mixed reading of multiple character type groups, by using several small-scale recognition logics designed for dedicated fields.

In outline, the invention is characterized in that each subcategory included in the recognition logic is given a priority in advance, and the candidate categories obtained from the recognition unit are sifted on that basis; and in that grammatical rules and knowledge specific to the form being read are used to sift the leading candidate subcategories once more, based on the sequential relationships between character types and between category names.

FIG. 1 shows an embodiment of the present invention. In the figure, 1 is a character recognition unit; 2 is a candidate category selection means based on priority; 3 is a candidate memory that stores the candidate category names and priorities for each character; 4 is a character pointer; 5 is a character type control means that determines a candidate category based on the contents of candidate memory 3; 12 is a result memory that stores three items of information for each character, namely the candidate category name, the priority, and the character type code; 6 is a forward subcategory processing unit that re-determines the candidate category of the current character based on the character type code of the immediately preceding character; 13 is a knowledge table that stores allow/forbid information on the sequential relationship between two adjacent characters; 7 is a forward knowledge processing unit that checks the sequential relationship between the current character and the immediately preceding character against knowledge table 13 and corrects the candidate category of the former; 8 is a backward subcategory processing unit that re-determines the candidate category of the immediately preceding character based on the character type code of the current character; 9 is a backward knowledge processing unit that checks the sequential relationship between the current character and the immediately preceding character against knowledge table 13 and corrects the candidate category of the latter; 10 is a field end detection unit that determines whether the current character is the end of the field; and 11 is a result output unit that outputs the contents of result memory 12 as the reading result.

Next, the operation of the above embodiment is explained step by step.

In this embodiment, each subcategory in the recognition logic is assigned in advance one of three priority levels, "0", "1", or "2", according to the definition in FIG. 5, based on the degree to which it can be confused with other categories.

Hereafter, the subcategories at each priority level are referred to collectively as S0, S1, and S2.

Subcategories in the upper rows of FIG. 5 have higher priority.

First, for each character on the form, recognition unit 1 successively transfers the category names and priorities corresponding to the subcategories that matched the recognition logic.

Category selection means 2 receives these as input, selects the high-priority candidate categories for each character, and writes them into candidate memory 3.

Candidate memory 3, one embodiment of which is shown in FIG. 2, can store up to five candidate pairs (P(i, 1): category name, P(i, 2): priority, i = 1, ..., 5) for each of the n characters in a context field (a semantically related entry area on the form).

FIG. 3 shows how category selection means 2 operates: 31 is an example of a character on the form, 32 is an example of the candidates output by recognition unit 1, and 33 shows the candidates selected by category selection means 2 as written into the part of candidate memory 3 that corresponds to that character.

For example, suppose that processing character example 31 in the recognition unit yields three readings: as the character "ワ" it is recognized with priority "1" in the sense of FIG. 5, as the character "り" it is recognized with priority "2", and as the character "7" it is recognized with priority "1".

In this case, the characters "ワ" and "7" are written into candidate memory 3.
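The selection step in this example can be sketched in Python as follows. This is only an illustration under stated assumptions: the text says that high-priority candidates are selected and that at most five pairs are stored per character, so the sketch assumes that "high priority" means "sharing the best priority among the recognized candidates" (a smaller number is a higher priority). The names `Candidate`, `MAX_CANDIDATES`, and `select_candidates` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# (category name, priority); a smaller number means a higher priority
Candidate = Tuple[str, int]

MAX_CANDIDATES = 5  # candidate memory 3 holds at most five pairs per character


def select_candidates(recognized: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that share the best priority (assumed rule)."""
    if not recognized:
        return []
    best = min(priority for _, priority in recognized)
    kept = [c for c in recognized if c[1] == best]
    return kept[:MAX_CANDIDATES]


# The example from the text: "ワ" (priority 1), "り" (priority 2), "7" (priority 1)
selected = select_candidates([("ワ", 1), ("り", 2), ("7", 1)])
print(selected)
```

With the input from the text, the priority-2 reading is dropped and the two priority-1 readings remain, which matches the contents written into candidate memory 3 in the example.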

Once the candidate category information for the whole context field has been stored in candidate memory 3, processing is carried out in order from the first character of the field by character type control unit 5, forward subcategory processing unit 6, forward knowledge processing unit 7, backward subcategory processing unit 8, and backward knowledge processing unit 9, based on the contents of candidate memory 3 and knowledge table 13, and each result is written into the result memory.

Character pointer 4 designates the character to be handled by the five processes described above. It initially designates the first character and, as long as field end detection unit 10 has not detected the end of the field, designates the next character in turn.

The role of character type control unit 5 is to provisionally determine the candidate category of each character based on the contents of candidate memory 3 (rejecting the character when no decision is possible) and to write the result into result memory 12.

When character pointer 4 designates the t-th character in the current field, the candidate category information D_t for that character is determined according to FIG. 6 from the corresponding contents P_t of candidate memory 3 (P_t(i, j) in FIG. 2, where i = 1, 2, ..., 5 and j = 1, 2), and is written into result memory 12.

One embodiment of result memory 12 is configured as shown in FIG. 4 and stores the following three items of information for each character: D_t(1) holds the candidate category name S, or a reject R indicating that the category could not be determined; D_t(2) holds the priority (0 to 3); and D_t(3) holds the character type code (in this embodiment, 0: don't care, 1: alphabetic, 2: numeral, 3: kana, 4: symbol; the source text reads "symbol" for code 2 as well, apparently in error).
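As an illustration of this layout, one result memory entry could be modeled as below. This is a sketch rather than the patent's implementation: the class and field names are hypothetical, rejects are represented by the string "R" purely for convenience, and code 2 is taken to mean numerals, which is an assumption because the source text prints "symbol" for both code 2 and code 4.

```python
from dataclasses import dataclass
from enum import IntEnum


class CharType(IntEnum):
    """Character type codes D_t(3) as listed in the embodiment."""
    DONT_CARE = 0
    ALPHABET = 1
    NUMERAL = 2   # assumption: the source prints "symbol" for code 2
    KANA = 3
    SYMBOL = 4


@dataclass
class ResultEntry:
    category: str         # D_t(1): candidate category name, or "R" for a reject
    priority: int         # D_t(2): 0 (highest) to 3
    char_type: CharType   # D_t(3): character type code

    def is_reject(self) -> bool:
        return self.category == "R"


# A kana character decided with priority 1, and an undecided (rejected) character
decided = ResultEntry(category="カ", priority=1, char_type=CharType.KANA)
rejected = ResultEntry(category="R", priority=1, char_type=CharType.DONT_CARE)
```

Keeping the three items together per character mirrors how the later forward and backward steps read D_{t-1} and rewrite D_t as a unit.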

In FIG. 6, the "S" in D_t(1) denotes a candidate category name that is present in P_t.

A reject is also regarded as a category, and four priority levels are defined for it as shown in FIG. 7.

Rejects in the upper rows of FIG. 7 have higher priority, and hereafter the rejects at each priority level are referred to collectively as R0, R1, R2, and R3.

Suppose, for example, that the recognition result for the t-th character consists of just one subcategory with priority "1". P_t then contains a single S1, and according to FIG. 6 (fifth row from the bottom) result memory D_t stores the category name S to which that subcategory belongs, its priority "1", and the character type code of that category.

Next, forward subcategory processing unit 6 resolves the ambiguity of D_t for the current character based on the character type code of the immediately preceding, (t-1)-th, character, D_{t-1}(3).

When D_t(1,2) of the current character is S2, R1, or R2 (that is, when the result memory contents for the character are regarded as ambiguous), D_t(1), D_t(2), and D_t(3) are updated according to FIG. 8, based on D_{t-1}(3), by drawing from the contents P_t of candidate memory 3 a candidate category of the same character type as the immediately preceding character.

Here D_t(1,2) denotes D_t(1) and D_t(2) of the result memory taken together, giving a candidate category name or reject with its priority. In FIG. 8, "-" in the result memory columns means don't care, and "i" in the resulting D_t(2) column denotes the priority of the matched candidate category.

For example, suppose that the character type of the immediately preceding character is numeral, and that one numeral subcategory and one katakana subcategory have been extracted for the current character, so that the current character is a reject (for example, R1) that simultaneously satisfies candidate subcategories of both numeral and katakana. According to FIG. 8 (seventh row from the bottom), D_{t-1}(3) ≠ 0 and exactly one category, the numeral category, has the same character type as D_{t-1}(3), so the numeral is selected as the candidate category for the current character.
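The single case described in this example can be sketched as follows. FIG. 8 itself is not reproduced in the text, so the sketch covers only the rule stated here (previous character type is not "don't care" and exactly one candidate of the same type exists); the function name, the tuple layout, and the numeric codes are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# (category name, priority, character type code)
Candidate = Tuple[str, int, int]

DONT_CARE = 0
NUMERAL, KANA = 2, 3  # illustrative character type codes


def forward_subcategory(prev_char_type: int,
                        candidates: List[Candidate]) -> Optional[Candidate]:
    """Pick the one candidate whose type matches the previous character.

    Returns None when the previous type is "don't care", or when zero or
    several candidates match, so the caller leaves the result unchanged.
    """
    if prev_char_type == DONT_CARE:
        return None
    same_type = [c for c in candidates if c[2] == prev_char_type]
    return same_type[0] if len(same_type) == 1 else None


# Previous character was a numeral; the current character is ambiguous
# between the numeral "7" and the katakana "ワ".
choice = forward_subcategory(NUMERAL, [("7", 1, NUMERAL), ("ワ", 1, KANA)])
print(choice)
```

Returning `None` instead of guessing keeps the entry ambiguous, so that a later step (forward knowledge processing or the backward steps) still has a chance to resolve it.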
Since the number of categories for the same character type is one for numbers, numbers are selected as the candidate category for the character.

Next, forward knowledge processing unit 7 resolves the ambiguity of D_t for the current character based on the category name of the immediately preceding character, D_{t-1}(1), and knowledge table 13.

When D_t(1,2) of the current character is S1, S2, or R2, and D_{t-1}(1,2) of the immediately preceding character is S0, S1, or S2, then D_t(1), D_t(2), and D_t(3) are updated according to FIG. 9, based on D_{t-1}(1) and on conditions on the contents P_t of candidate memory 3.

In FIG. 9, "j" in the D_t(3) column denotes the character type code corresponding to the category of the permitted character in P_t.

Knowledge table 13 specifies, according to the purpose and use of the form, whether each of the various combinations of all categories to be read is forbidden or permitted as a sequence.

For example, for a form on which amounts of money are entered, knowledge table 13 includes conditions such as forbidding kana characters and permitting numerals after "¥" (the yen sign).

In that case, if the immediately preceding character has been determined to be "¥" and the current character is a reject R2 that simultaneously satisfies candidate subcategories of both numeral and katakana, then according to FIG. 9 the only permitted character in P_t is the one numeral candidate category, and the numeral is selected as the candidate category for the current character.
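The yen-sign example can be sketched as a lookup against a small knowledge table. The text describes the table as holding allow/forbid information for pairs of categories; for brevity this sketch keys the table by the preceding category and lists the character types forbidden after it, which is a simplification. The table contents, names, and codes below are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# (category name, character type code)
Candidate = Tuple[str, int]

NUMERAL, KANA = 2, 3  # illustrative character type codes

# Hypothetical knowledge table: preceding category -> character types
# that are forbidden for the character that follows it.
FORBIDDEN_AFTER: Dict[str, Set[int]] = {
    "¥": {KANA},  # on a money form, kana may not follow the yen sign
}


def forward_knowledge(prev_category: str,
                      candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the only candidate permitted after prev_category, if unique."""
    forbidden = FORBIDDEN_AFTER.get(prev_category, set())
    allowed = [c for c in candidates if c[1] not in forbidden]
    return allowed[0] if len(allowed) == 1 else None


# Previous character is "¥"; the current one is ambiguous between "7" and "ワ".
choice = forward_knowledge("¥", [("7", NUMERAL), ("ワ", KANA)])
print(choice)
```

Because the table depends on the purpose of the form, it is kept as data separate from the selection function, so a different form only needs a different table.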

Next, backward subcategory processing unit 8 resolves the ambiguity of D_{t-1} for the immediately preceding character based on the character type code of the current character, D_t(3).

Forward subcategory processing unit 6 tried to pick the candidate category considered correct for the current character using the character type code of the immediately preceding character; here the same processing is performed in the opposite direction.

When D_{t-1}(1,2) of the immediately preceding character is R2 and D_t(1,2) of the current character is S0, S1, or S2, then D_{t-1}(1), D_{t-1}(2), and D_{t-1}(3) are updated according to FIG. 10, based on D_t(3) and on conditions on the contents P_{t-1} of candidate memory 3.

Next, backward knowledge processing unit 9 resolves the ambiguity of D_{t-1} for the immediately preceding character based on the category name of the current character, D_t(1), and knowledge table 13.

Forward knowledge processing unit 7 tried to pick a candidate category from the permitted category sequences using the category of the immediately preceding character; here the same processing is performed in the opposite direction.

When D_{t-1}(1,2) of the immediately preceding character is R2 and D_t(1,2) of the current character is S0, S1, or S2, the result is re-determined from the contents P_{t-1} of candidate memory 3 according to FIG. 11, based on D_t(1), and D_{t-1}(1), D_{t-1}(2), and D_{t-1}(3) are updated.

Next, field end detection unit 10 determines whether the current character is the last character of the context field. Depending on the outcome, either the processes described above are repeated for the next character, or the contents of result memory 12 are output as the reading result through result output unit 11.

FIG. 12 is an explanatory diagram illustrating the character selection processing of this embodiment by means of a concrete example in which, as a rule, entries are written in segmented form (wakachigaki).

51 denotes the characters entered on the form, 52 the results obtained from the recognition unit, 53 the contents of the candidate memory, 54 the result memory after character type control, 55 the control conditions governing execution of the forward and backward subcategory and knowledge processing, and 56 the result memory after all processing has finished.

The explanation below follows these in order. The entered characters 51 are the seven kana characters "カナガワケン" (Kanagawa-ken, i.e. Kanagawa Prefecture).

Assume that the per-character recognition results 521 output from the recognition unit are the candidate categories and priorities shown in the figure.

Character selection processing starts from these results.

That is, category selection means 2 first writes only the top-ranked categories into candidate memory 53.

Here, the "ヤ" of 521, the "7" of 523, and the "）" of 525 are each removed.

Next, character type control is performed on the first character.

Based on the contents of 531, the candidate category name, priority, and character type code (3: kana) are written to 541 according to the conditions of FIG. 6.

Next, the control conditions governing execution of the forward and backward processes for this character are as shown at 551, where a circle (○) marks a process that is executed.

In 55, the conditions relate, from top to bottom, to forward subcategory processing, forward knowledge processing, backward subcategory processing, and backward knowledge processing.

Because 541 is the first character, forward knowledge processing only confirms whether the category of that character may appear at the head of a field, and the result is 561.

Next, the character 512 is processed. Character type control based on 532 yields 542 from the conditions of FIG. 6.

Because the priority of 542 is "2", the forward processes are executed as indicated at 552; but because the immediately preceding character is not a reject (from result 561), the backward processes are not executed.

In this example, the results for the characters following 542 can likewise always be decided by forward processing, so backward processing is never executed.

From the conditions shown in FIGS. 7 and 8, 562 is obtained based on 552.

The process of obtaining 563 from 533 is self-evident, so its explanation is omitted.

For the character 514, character type control based on 534 yields 544 from FIG. 6.

Because the priority of 544 is the highest (D(2) = 0), none of the processes is executed, as shown at 544, and 564 is obtained.

Note that when the priority is "0", the category is adopted with top priority even if it does not follow the segmented-writing principle.

Next, for 515, two different character types, "ワ" and "7", appear simultaneously in 535. As a result of character type control, the entry becomes, as shown at 545, a reject with priority 1 and character type code 0 (don't care).

Based on 545, only the two forward processes are performed, as indicated at 555. As a result, the category name whose character type matches the character type code of the immediately preceding character is newly selected, and "ワ" is written to 565.

The processing based on 536 and 537 is then carried out by the same procedure, yielding 566 and 567.

If 51 constitutes one context field, then after these processes the category names in 56 are output as the reading result.

As explained above, according to the present invention, when the result obtained by the recognition unit is a reject caused by the appearance of multiple candidates, or an ambiguous low-priority subcategory, the results for the character immediately before or after it (that is, its category name, character type code, and so on) are consulted, so that a single candidate can be selected and extracted and misreadings can be eliminated.

Furthermore, the above character selection processing method makes highly accurate mixed reading possible without changing the content or expanding the scale of the recognition logic for the other character types.

[Brief explanation of the drawing]

FIG. 1 is a block diagram of an embodiment of the present invention; FIG. 2 is a block diagram of the candidate memory; FIG. 3 is an explanatory diagram of the processing of the candidate category selection means; FIG. 4 is a block diagram of the result memory; FIG. 5 is an explanatory diagram of the definition of subcategory priorities; FIG. 6 is an explanatory diagram of the character type control processing; FIG. 7 is an explanatory diagram of the definition of reject priorities; FIG. 8 is an explanatory diagram of the forward subcategory processing; FIG. 9 is an explanatory diagram of the forward knowledge processing; FIG. 10 is an explanatory diagram of the backward subcategory processing; FIG. 11 is an explanatory diagram of the backward knowledge processing; and FIG. 12 is an explanatory diagram that illustrates the processing according to the present invention using a concrete example.

1: recognition unit, 2: candidate category selection means, 3: candidate memory, 4: character pointer, 5: character type control unit, 6: forward subcategory processing unit, 7: forward knowledge processing unit, 8: backward subcategory processing unit, 9: backward knowledge processing unit, 10: field end detection unit, 11: result output unit, 12: result memory, 13: knowledge table, 31: example of characters entered on the form, 32: determination result of the recognition unit, 33: candidate memory, 51: example of characters entered on the form, 52: determination result of the recognition unit, 53: candidate memory, 54: result memory after character type control processing, 55: control conditions for process execution, 56: result memory after character selection processing.

Claims

[Claims]

1. A character selection processing method for a character recognition device that reads target characters composed of a mixture of multiple character type groups, wherein the recognition unit of the character recognition device is configured to output a plurality of candidate categories including a plurality of candidate subcategories, and the post-processing unit is configured to perform: character type control processing that compares the priorities of the candidate subcategories and selects a candidate category with a high priority; subcategory processing that selects the candidate category considered correct for the character in question based on the connection relationship between the candidate category of the character immediately before or after it and the candidate category of the character in question; and knowledge processing that selects the candidate category considered correct for the character in question based on the connection relationship, between the categories themselves, of the candidate category of the character immediately before or after it and the candidate category of the character in question; and wherein the priority given to each subcategory in order to sift the candidate subcategories obtained from the recognition unit can be set in advance, the subcategories being classified into a group of subcategories that can be clearly distinguished from all other categories, a group of subcategories that can be clearly distinguished from all categories of different character type groups, and the remaining subcategories.