JPH0632092B2

JPH0632092B2 - Word matching method

Info

Publication number: JPH0632092B2
Application number: JP63104635A
Authority: JP
Inventors: 雅己小黒; 清仲林; 直孝大光明
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 1988-04-27
Filing date: 1988-04-27
Publication date: 1994-04-27
Anticipated expiration: 2009-04-27
Also published as: JPH01276285A

Description

[Industrial Field of Application]

The present invention relates to a word matching method, and in particular to a word recognition method that efficiently recognizes words when multiple candidate characters exist for a character position of an input character string, even if the input characters contain errors.

[Prior Art]

In conventional word recognition, when a word is recognized from an input character string represented as a sequence of character codes, it may be necessary to match each candidate character at each position of the input string against the characters at one or more positions of any given word. This need arises when multiple candidate characters exist for each character position of the input string and the string is assumed to contain wrong, missing, or surplus characters.

FIG. 6 is a block diagram showing an example of such a conventional word recognition device.

In FIG. 6, 1 is a word selection table, 2 is a selection processing unit, 3 is a judgment value table, 4 is a judgment value accumulation processing unit, 5 is a sort processing unit, and 6 is an evaluation value table. For example, suppose "デコタル" is input (here コ is assumed to be a mistaken entry of ジ). The selection processing unit 2 refers to the word selection table 1, outputs a word code and a judgment value for each of the first character, the second character, and so on, and transfers them to the judgment value accumulation processing unit 4 in the next stage. The judgment value accumulation processing unit 4 determines an evaluation value for each input word code by referring to the judgment value table 3 and transfers it to the sort processing unit 5 in the next stage. The sort processing unit 5 ranks the candidate words using the evaluation value table and outputs them. The highest-ranked candidate is taken as the recognition result.

This word recognition device uses a word dictionary that associates each word with its sequence of character codes. To compare the character code sequence of the input string with each word, the character codes that make up the input string are entered one character position at a time. As described above, when "デコタル" is input, W11 for デジタル, W12 for データ, and W13 for デコーダ are output for the first character "デ", and W13 for デコーダ and W25 for レコード are output for the second character "コ". In the same way, a word code and a judgment value are output for each input character. The selection processing unit 2 thus uses the word selection table 1 to output the codes of the words that contain the given character at each character position.

To handle wrong, surplus, and missing characters, the device also covers cases where the character position within a dictionary word (hereinafter, the in-dictionary position) and the character position within the input string (hereinafter, the recognition-time position) do not coincide. For each candidate character at one recognition-time position, it searches the word selection table at one or more in-dictionary positions.

For the conventional word matching method, see, for example, the specification and drawings of the earlier-filed Japanese Patent Application No. 61-248415 (title of the invention: Word Recognition Device).

[Problems to be Solved by the Invention]

In such a word recognition device, one or more candidate characters exist at each recognition-time position, and the word selection table is searched at one or more in-dictionary positions for each candidate character. As a result, incorrect evaluation values are output, as explained next, and this erroneous output needs to be corrected.

FIG. 7 is a diagram showing a concrete processing sequence of the conventional word recognition device.

In character recognition, as shown in FIG. 7, one or more character candidates are output for each recognition-time position. Suppose the word dictionary contains 東京, 京都, and 都心, and "京都" is input. The three characters 東, 京, and 亨 are output as candidates for recognition-time position c1 (the first character), and the three characters 都, 郡, and 群 are output as candidates for c2. In this situation a word ("東京") can be formed using only candidate characters from the same recognition-time position.

With the method described above, the selection processing unit 2 in FIG. 7 retrieves "東京" from candidate character "東" when c1 is treated as the first in-dictionary position, and retrieves "東京" again from candidate character "京" when c1 is treated as the second in-dictionary position. Although only one candidate character can be used at any one recognition-time position, the search for "東京" uses two candidate characters from a single recognition-time position, so an incorrect evaluation value results. The judgment value accumulation unit gives "東京" 1.7, "京都" 1.6, and "都心" 0.8, and the sort processing unit then orders them as "東京" 1.7, "京都" 1.6, "都心" 0.8. The conventional method therefore does not process the input correctly.

As a result, in the example of FIG. 7, "京都", which should have been extracted in first place, is output in second place.
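The double counting described above can be reproduced with a short sketch. This is an illustration under stated assumptions, not the circuit of FIG. 6: the function name `conventional_scores` and the data layout are hypothetical, and the scores for 亨, 郡, and 群 are assumed values (the text gives scores only for 東, 京, and 都).

```python
from collections import defaultdict

# Candidate characters per recognition-time position: (character, score).
# Scores for 東, 京 and 都 follow the text; those for 亨, 郡 and 群 are assumed.
candidates = {
    1: [("東", 0.9), ("京", 0.8), ("亨", 0.7)],
    2: [("都", 0.8), ("郡", 0.6), ("群", 0.5)],
}
dictionary = ["東京", "京都", "都心"]


def conventional_scores(candidates, dictionary, max_dict_pos=2):
    """Add every hit, with no check on the recognition-time position."""
    totals = defaultdict(float)
    for rec_pos, chars in candidates.items():
        for char, score in chars:
            # Each candidate is tried at every in-dictionary position.
            for dict_pos in range(1, max_dict_pos + 1):
                for word in dictionary:
                    if len(word) >= dict_pos and word[dict_pos - 1] == char:
                        totals[word] += score
    return {word: round(total, 6) for word, total in totals.items()}


scores = conventional_scores(candidates, dictionary)
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

In this sketch "東京" collects 0.9 from 東 and 0.8 from 京, both taken from position c1, which is how it overtakes "京都".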

An object of the present invention is to solve this problem of the prior art and to provide a word matching method that does not output erroneous evaluation values when characters are matched at one or more in-dictionary positions for one recognition-time position, that recognizes words with high accuracy even when the input contains wrong, missing, or surplus characters, and that also recognizes words with high accuracy when processing is performed in parallel.

[Means for Solving the Problems]

To achieve the above object, the word matching method of the present invention determines which of a set of preset words an input character string composed of character codes corresponds to, and comprises the following means. Selection processing means includes a word selection table that associates, for each position in the character sequence of a word (hereinafter, the in-dictionary position), the characters used with the words; it receives the character code of a candidate character at an arbitrary character position of the input string (hereinafter, the recognition-time position) together with an in-dictionary position, and outputs the corresponding word code. Position verification means includes a position history table that records, for each word code, the recognition-time position of the character code used to select that word code; it receives a word code and the recognition-time position of the character code used to select it, and outputs a search history indicating whether that recognition-time position has already been searched for that word code with another candidate character. Judgment value accumulation means adds evaluation values indicating certainty for each word code output from the position verification means. Sort means extracts candidate words according to the magnitude relationship of the judgment values of the judgment value accumulation means.

For the candidate characters at each recognition-time position of the input string, the selection processing means selects from the word selection table, for each of one or more in-dictionary positions, the word code corresponding to the character code. The position verification means reads the search history for that word code and recognition-time position from the position history table and, only if it is unsearched, sets the search history to searched in the position history table and sends the word code to the judgment value accumulation means. The judgment value accumulation means cumulatively adds the evaluation values of the word codes output from the position verification means to obtain judgment values for the input string, and the sort means extracts the candidate words. A further feature is that a measure of certainty obtained from the recognition result for each character of the input string is written as the search history in the position history table.

[Operation]

The present invention improves the rate of correct recognition by allowing only one search per recognition-time position. A score and a recognition-time position are held for each candidate character at each recognition-time position of the input string, and a position history table records, for each word code, the recognition-time position of the character code used to select that word code. A position verification unit receives a word code and the recognition-time position of the candidate character used to select it, and outputs the word code and its evaluation value only if that recognition-time position is unsearched for that word code. Using the word code obtained by the selection processing unit and the recognition-time position of the candidate character that selected it, the unit checks whether a selection from the same recognition-time position has already been made. Only when the selection is being made for the first time does it output the word code and evaluation value to the judgment value accumulation unit, and at the same time it records the position as verified in the position history table.

Thus, even when one or more candidate characters exist at each recognition-time position of the input string and the string contains wrong, missing, or surplus characters, the erroneous evaluation of word codes that arises when one or more candidate characters at one recognition-time position are matched against characters at one or more in-dictionary positions is avoided by means of the position history table, which enables highly accurate word recognition.

[Embodiment]

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a block diagram of a word matching device showing one embodiment of the present invention.

The word matching device of the present invention comprises a selection processing unit 2 that includes a word selection table 1, a position verification unit 7 that includes a position history table 8, a judgment value accumulation unit 4 that includes an evaluation value table 3, and a sort processing unit 5 that includes a candidate word table (not shown).

FIG. 2 is a process diagram showing one embodiment of the word matching method of the present invention.

The operating principle of the present invention is now explained for the case where the same character recognition result as in FIG. 7 is obtained for the same input character string.

The selection processing unit 2 performs the following selection processing for each recognition-time position. First, it searches the word selection table 1 with the character code of each candidate character at the recognition-time position and reads out the word codes corresponding to that character code. The word selection table 1 is, for each in-dictionary position, a table of correspondences between character codes and the word codes that have that character code at that in-dictionary position.

For recognition-time position c1, the device selects words both when the in-dictionary position is the first character and when it is the second character. In the first case, "東京" is read out from candidate character "東" and "京都" from candidate character "京", and the device outputs the combinations ("東京", evaluation value 0.9, recognition-time position c1) and ("京都", evaluation value 0.8, c1). In the second case, "東京" is read out from candidate character "京", and the combination ("東京", 0.8, c1) is output. For recognition-time position c2, the word selection tables 1 for the first and second in-dictionary positions yield the combinations ("都内", 0.8, c2) and ("京都", 0.8, c2). By processing each position in turn in this way, each selected word code is sent, together with its recognition-time position and evaluation value, to the position verification unit 7 in the next stage.
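One way to picture the word selection table is as a lookup keyed by the pair (in-dictionary position, character). The sketch below is a minimal illustration, assuming the three-word dictionary of the FIG. 7 example (東京, 京都, 都心); the function names `build_word_selection_table` and `select` are hypothetical and are not taken from the specification.

```python
from collections import defaultdict


def build_word_selection_table(dictionary):
    """Map (in-dictionary position, character) to the words that match."""
    table = defaultdict(list)
    for word in dictionary:
        for dict_pos, char in enumerate(word, start=1):
            table[(dict_pos, char)].append(word)
    return table


def select(table, rec_pos, chars, dict_positions):
    """Return (word, evaluation value, recognition-time position) triples."""
    selected = []
    for dict_pos in dict_positions:
        for char, score in chars:
            for word in table.get((dict_pos, char), []):
                selected.append((word, score, rec_pos))
    return selected


table = build_word_selection_table(["東京", "京都", "都心"])
c1 = select(table, "c1", [("東", 0.9), ("京", 0.8)], dict_positions=(1, 2))
c2 = select(table, "c2", [("都", 0.8)], dict_positions=(1, 2))
print(c1)
print(c2)
```

Note that "東京" appears twice in `c1`, once per in-dictionary position; removing that duplicate is the job of the position verification stage.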

The position verification unit 7 uses the word code sent from the selection processing unit 2 to retrieve from the position history table 8 the search history for the recognition-time positions of that word code, and checks whether a search at the recognition-time position received with the word code has been performed before. The position history table 8 shown in FIG. 2 is an example covering recognition-time positions up to the second character. For each word code it has fields for recognition-time positions c1 and c2 that record whether the position is processed (1) or unprocessed (0).

As an example of position verification using this table, suppose the position history table 8 is in the state shown in FIG. 2, with the search history from the preceding position verification already recorded, and the result with word code "東京", evaluation value 0.8, and recognition-time position c1 then arrives. Looking up word code "東京" in the position history table 8 reveals the duplication: recognition-time position c1 is processed (1) and recognition-time position c2 is unprocessed (0). Since the recognition-time position c1 sent from the selection processing unit 2 is thus found to be already processed for word code "東京", the word code "東京" is not sent to the judgment value accumulation unit 4.

In the position verification of word code "京都" with recognition-time position c2, by contrast, c2 is found to be unprocessed, so the word code "京都" is sent together with its evaluation value to the judgment value accumulation unit 4. At the same time, searched (1) is set in the field of the position history table 8 designated by "京都" and c2.
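The check-then-mark behaviour of the position verification unit can be sketched as follows. This is an assumption-laden illustration: the class name `PositionHistory` and the use of a Python set in place of the 1/0 fields of FIG. 2 are choices made here, not details from the specification.

```python
class PositionHistory:
    """Records which (word code, recognition-time position) pairs are searched."""

    def __init__(self):
        # A pair present in the set corresponds to a field holding 1.
        self._searched = set()

    def verify(self, word, rec_pos):
        """Return True and mark the pair if it was unprocessed, else False."""
        key = (word, rec_pos)
        if key in self._searched:
            return False  # already processed: do not forward the word code
        self._searched.add(key)
        return True


history = PositionHistory()
triples = [
    ("東京", 0.9, "c1"),
    ("京都", 0.8, "c1"),
    ("東京", 0.8, "c1"),  # second hit for 東京 from the same position c1
    ("都心", 0.8, "c2"),
    ("京都", 0.8, "c2"),
]
forwarded = [t for t in triples if history.verify(t[0], t[2])]
print(forwarded)
```

Only the second "東京" triple is dropped, because "東京" has already been selected once from c1; "京都" passes twice because its two hits come from different recognition-time positions.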

The judgment value accumulation unit 4 cumulatively adds the evaluation values for identical word codes among the word data output by the position verification unit 7. In the example above, the word code "東京" is sent for only one of the searches in FIG. 2, so its evaluation value is 0.9, which is the correct evaluation value for a word obtained from the recognized candidate characters. The sort processing unit 5 then compares the evaluation values. In the example of FIG. 2, the word recognition device outputs "京都" as the first-ranked candidate.
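The accumulation and sorting steps can be sketched in a few lines. The input below is assumed to be the set of triples that survived position verification in the running example (with 都心 as the third dictionary word, as in FIG. 7); the function name `accumulate_and_rank` is hypothetical.

```python
from collections import defaultdict


def accumulate_and_rank(verified_triples):
    """Sum evaluation values per word code and rank by the total."""
    totals = defaultdict(float)
    for word, score, _rec_pos in verified_triples:
        totals[word] += score
    # Round to hide floating-point noise in the displayed totals.
    ranked = [(word, round(total, 6)) for word, total in totals.items()]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


verified = [
    ("東京", 0.9, "c1"),
    ("京都", 0.8, "c1"),
    ("都心", 0.8, "c2"),
    ("京都", 0.8, "c2"),
]
ranking = accumulate_and_rank(verified)
print(ranking)
```

With the duplicate hit removed, "京都" totals 1.6 against 0.9 for "東京", so the ranking is reversed compared with the conventional result.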

FIG. 3 is a diagram of the position history table of the present invention in its extended form.

FIG. 3 shows the position history table when the recognition-time positions are extended to N characters. The position history table 8 consists of fields that record, for each word code W and each recognition-time position C, whether it is processed (1) or unprocessed (0). When the table is looked up, the 1/0 value is obtained from the word code and the recognition-time position currently being processed. Thus, when the selection processing unit 2 selects word code Wk from the character code CDj of a candidate character at recognition-time position Ci, the value at row Wk, column Ci of the position history table 8 is read to decide whether the word code should be output to the judgment value accumulation unit 4. If the entry is unprocessed, searched (1) is recorded at row Wk, column Ci, the processed state over all recognition-time positions C1 to Cn is confirmed, and the word code and evaluation value are sent to the judgment value accumulation unit 4. If the table shows that the input word code has already been processed, the word code is not output to the judgment value accumulation unit 4.
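The row-and-column layout of FIG. 3 maps naturally onto a two-dimensional array of flags. The sketch below assumes 1-based recognition-time positions and string word codes such as "W1"; the class name `FlagTable` and method `check_and_mark` are hypothetical.

```python
class FlagTable:
    """Position history table: one row per word code, one column per position."""

    def __init__(self, word_codes, n_positions):
        # Every field starts as unprocessed (0).
        self.rows = {code: [0] * n_positions for code in word_codes}

    def check_and_mark(self, word_code, rec_pos):
        """Read row Wk, column Ci; mark it 1 and return True if it was 0."""
        row = self.rows[word_code]
        if row[rec_pos - 1] == 1:
            return False  # processed: the word code is not forwarded
        row[rec_pos - 1] = 1
        return True


table = FlagTable(["W1", "W2"], n_positions=4)
first = table.check_and_mark("W1", 3)
second = table.check_and_mark("W1", 3)
print(first, second, table.rows)
```

A fixed-size array like this mirrors a memory table addressed by word code and position, which is presumably closer to the hardware than a hash set, at the cost of storing a field for every pair.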

The configuration of FIG. 1 is now described in more detail.

In the configuration of FIG. 1, sets consisting of a candidate character, its recognition-time position, and its evaluation value are input for each recognition-time position of the input string, and the most probable word is output. For each candidate character code at each recognition-time position, the search circuit 11 uses the word selection table 1 in memory to look up the word codes corresponding to the character code for one or more in-dictionary positions, and sends the word codes found to the determination circuit 12. The determination circuit 12 refers to the position history table 8 in memory and checks whether the word code has been searched in duplicate. If it has not, the circuit records the search in the position history table 8 and sends the word code and evaluation value to the adder circuit 13. The adder circuit 13 reads the current judgment value for the word code from the evaluation value table 3 in memory, updates the judgment value of the word code by adding the evaluation value, and writes the value back to the evaluation value table 3. After all in-dictionary positions have been processed for all recognition-time positions, the sort circuit 9 sorts the judgment value table 6 (not shown) and outputs the result, namely the word codes and their judgment values.

FIG. 4 is a block diagram of a word matching device showing another embodiment of the present invention, and FIG. 5 is a diagram of the position history tables used in FIG. 4.

FIG. 4 shows an example configuration for matching in parallel.

In FIG. 4, the search circuits 11a to 11n and determination circuits 12a to 12n of the selection processing and position verification units 10a to 10n run in parallel, one per in-dictionary position, and create position history tables 8a to 8n, one per in-dictionary position. The adder circuit 13 then calculates the judgment values, and the sort circuit 9 compares them and outputs the result. Here, as shown in FIG. 5(a), (b), and (c), the position history tables 8 and 8a to 8n record an evaluation value, rather than a 1/0 flag, in the field for each recognition-time position of each word code.

After the processing for each in-dictionary position is finished, the comparison circuit 14 examines the position history tables 8a to 8n. Where entries overlap, it compares their evaluation values and gives priority to the higher one, thereby avoiding erroneous evaluation due to duplicate searches. This makes it possible to match using the candidate character with the higher recognition score.

In FIG. 5, the first recognition-time position of word code W1 in the table for the first in-dictionary position overlaps with the first recognition-time position of word code W1 in the table for the M-th in-dictionary position. The two are therefore compared, and the higher score, 0.8, which belongs to the first in-dictionary position, is taken as the score for row W1, column 1.
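The comparison step of the parallel variant can be sketched as a merge that keeps the highest evaluation value per (word code, recognition-time position) pair. The per-position tables below are built from the running 東京/京都/都心 example rather than from the values of FIG. 5, and the names `merge_tables` and `judgment_values` are hypothetical.

```python
from collections import defaultdict


def merge_tables(tables):
    """Keep, for each (word, recognition-time position), the highest score."""
    best = {}
    for table in tables:
        for key, score in table.items():
            if key not in best or score > best[key]:
                best[key] = score
    return best


def judgment_values(best):
    """Sum the surviving scores per word code."""
    totals = defaultdict(float)
    for (word, _rec_pos), score in best.items():
        totals[word] += score
    return {word: round(total, 6) for word, total in totals.items()}


# One table per in-dictionary position, as if filled in parallel.
table_pos1 = {("東京", 1): 0.9, ("京都", 1): 0.8, ("都心", 2): 0.8}
table_pos2 = {("東京", 1): 0.8, ("京都", 2): 0.8}

best = merge_tables([table_pos1, table_pos2])
totals = judgment_values(best)
print(totals)
```

Because each worker writes only to its own table, no locking is needed during the parallel phase; the overlap for ("東京", 1) is resolved afterwards in favour of 0.9.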

Matching one recognition-time position against one or more in-dictionary positions is also needed when correcting the offset between recognition-time positions and in-dictionary positions that arises because the boundaries between characters or syllables are unknown, as with handwritten characters at arbitrary pitch or continuous speech.

As described above, the present invention provides a position history table that records, for each word code, the search history of its recognition-time positions. A position verification unit receives a word code and the recognition-time position of the character code that selected it, and outputs the word code and evaluation value only if that recognition-time position is unsearched for the word code, thereby checking for duplicate searches. An erroneous evaluation value could otherwise be output when one recognition-time position has several candidate characters, some combination of which forms a word, and characters are matched at one or more in-dictionary positions for one recognition-time position. Even in that case, the position verification unit consults the search history in the position history table using the word code and recognition-time position, so erroneous evaluation is prevented. When processing is performed in parallel, giving each processor its own position history table and recording the evaluation value of the character candidate at each recognition-time position as the table entry allows words to be recognized with high accuracy.

[Effect of the Invention]

As explained above, according to the present invention, words can be recognized with high accuracy even when each character position of the input string has one or more candidate character codes and the string contains wrong, missing, or surplus characters. In parallel processing as well, recording the evaluation value of the character candidate as the value in the position history table allows words to be recognized with high accuracy.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a word matching device showing one embodiment of the present invention. FIG. 2 is a process diagram for recognition processing using the device of FIG. 1. FIG. 3 shows the contents of the position history table in FIG. 1. FIG. 4 is a block diagram of a parallel-processing word matching device showing another embodiment of the present invention. FIG. 5 shows example configurations of the position history tables used in the device of FIG. 4. FIG. 6 is a block diagram of a conventional word recognition device. FIG. 7 shows the operating sequence of the device of FIG. 6.

1: word selection table, 2: selection processing unit, 3: judgment value table, 4: judgment value accumulation unit, 5: sort processing unit, 6: position history table, 7: position verification unit, 8, 8a to 8n: position history tables, 9: sort circuit, 10a to 10m: selection processing and position verification units, 11: search circuit, 12: determination circuit, 13: adder circuit, 14: comparison circuit.

Claims

1. A word matching method for determining which of a set of preset words an input character string composed of character codes corresponds to, comprising: selection processing means that includes a word selection table associating, for each position in the character sequence of a word (hereinafter, the in-dictionary position), the characters used with the words, and that receives as input the character code of a candidate character at an arbitrary character position of the input character string (hereinafter, the recognition-time position) and an in-dictionary position, and outputs the corresponding word code; position verification means that includes a position history table recording, for each word code, the recognition-time position of the character code used to select that word code, and that receives as input a word code and the recognition-time position of the character code used to select it, and outputs a search history indicating whether that recognition-time position of the word code has already been searched with another candidate character; judgment value accumulation means that adds, for each word code output from the position verification means, an evaluation value indicating certainty; and sort means that extracts candidate words according to the magnitude relationship of the judgment values of the judgment value accumulation means; wherein, for the candidate characters at each recognition-time position of the input character string, the selection processing means selects from the word selection table, for each of one or more in-dictionary positions, the word code corresponding to the character code; the position verification means reads the search history for the word code and the recognition-time position from the position history table and, only when it is unsearched, sets a searched search history in the position history table and sends the word code to the judgment value accumulation means; the judgment value accumulation means cumulatively adds the evaluation values of the word codes output from the position verification means to obtain a judgment value for the input character string; and the sort means extracts the candidate words.

2. The word matching method according to claim 1, characterized in that a measure of certainty obtained from the recognition result of each character of the input character string is written as the search history of the position history table.