JPH0766274B2

JPH0766274B2 - Word speech recognizer

Info

Publication number: JPH0766274B2
Application number: JP62132903A
Authority: JP
Inventors: Takao Watanabe (渡辺隆夫)
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1987-05-27
Filing date: 1987-05-27
Publication date: 1995-07-19
Anticipated expiration: 2010-07-19
Also published as: JPS63294600A

Description

[Detailed Description of the Invention] (Field of Industrial Application) The present invention relates to speech recognition technology, and in particular to an improvement in word speech recognition.

(Prior Art and Its Problems) In a word speech recognition apparatus, a recognition vocabulary is set according to the user's application, each word of the vocabulary is registered as a standard pattern, and recognition is performed using these patterns. However, depending on the vocabulary that is set, a conventional word speech recognition apparatus is prone to misrecognition. This is because the words to be recognized may include words that are similar to one another.

The object of the present invention is to realize a speech recognition apparatus in which recognition errors are unlikely to occur. To this end, the user specifies each word name to be used as a group of several mutually substitutable words, and the apparatus incorporates means for automatically determining, from these word groups, the combination that is least likely to cause misrecognition.

(Means for Solving the Problems) The invention of the present application includes means which, when the vocabulary to be recognized is set, defines for each target word function a word group consisting of a plurality of mutually substitutable words (an alternative word group) and determines the vocabulary by selecting one word from each alternative word group. The invention is characterized in that the selection of words in this vocabulary determining means is performed using the inter-word distances calculated from the phonetic descriptions of the words in the alternative word groups.

(Operation) The basic principle of the present invention is described below. Let K be the number of kinds of words the user requires. Here we say that there are K categories. The user gives the word names that can serve as substitutes for each word. That is, the user gives a group of possible words for each category. As a measure of how easily two words are confused with each other, consider the inter-word distance D(X, Y), where X and Y denote words.

The problem is to select one word from the word group of each category so that the distances between the selected words are as large as possible. From the standpoint of misrecognition, word pairs with a small inter-word distance must be avoided. To achieve this, the words should be selected so that the minimum of the distances between the selected words is maximized. When the number of possible combinations is small, the inter-word distances can simply be calculated for every combination, but the amount of processing increases markedly as the vocabulary grows. Here we describe the following method, which is not optimal but is a more efficient way of keeping the inter-word distances as large as possible.
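The exhaustive approach mentioned above (evaluate every combination and keep the one whose smallest inter-word distance is largest) can be sketched as follows. This is a minimal illustration, not the patent's implementation: `toy_distance` is a hypothetical stand-in for D(X, Y), and the function names and sample words are assumptions made for this example. At least two categories are assumed.

```python
from itertools import combinations, product


def toy_distance(x, y):
    # Hypothetical stand-in for D(X, Y): number of mismatched character
    # positions plus the difference in length. Not the patent's distance.
    return sum(a != b for a, b in zip(x, y)) + abs(len(x) - len(y))


def min_pairwise_distance(words, dist):
    # Smallest distance over all pairs of the chosen words.
    return min(dist(x, y) for x, y in combinations(words, 2))


def best_vocabulary_exhaustive(groups, dist):
    # Try every way of picking one word per category and keep the choice
    # whose minimum pairwise distance is largest (max-min criterion).
    return max(product(*groups),
               key=lambda choice: min_pairwise_distance(choice, dist))


# One list of substitutable words per category (sample data only).
groups = [["start", "begin"], ["stop", "halt"], ["up", "raise"]]
best = best_vocabulary_exhaustive(groups, toy_distance)
print(best, min_pairwise_distance(best, toy_distance))
```

The number of combinations is the product of the group sizes, which is why this search becomes impractical as the vocabulary grows.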

The basic procedure is to delete, one after another, words that have a small inter-word distance to other words from the alternative word groups of all categories, until one word remains in each category. Specifically, it proceeds as follows.

(1) For each word w in the alternative word groups of all categories, calculate the distance to every word outside its own category (excluding words that have already been deleted), and let d(w) be the minimum of these distances.

(2) Among the values d(w) obtained, select the word w (= w*) that gives the minimum, and delete it. However, when w* is the only word belonging to its category, select instead the w that gives the second smallest value.
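Steps (1) and (2), repeated until each category holds a single word, can be sketched as below. This is a hedged sketch under stated assumptions: `toy_distance` is a hypothetical stand-in for D(X, Y), the function names and sample words are invented for illustration, and the rule for a sole remaining word is generalized from "second smallest" to "the next smallest d(w) whose word can still be deleted". Ties are broken by input order, which the text does not specify. At least two categories and distinct words within a category are assumed.

```python
def toy_distance(x, y):
    # Hypothetical stand-in for D(X, Y): mismatched positions plus length gap.
    return sum(a != b for a, b in zip(x, y)) + abs(len(x) - len(y))


def select_vocabulary_greedy(groups, dist):
    # Work on copies so the caller's lists are not modified.
    remaining = [list(g) for g in groups]
    while any(len(g) > 1 for g in remaining):
        # Step (1): d(w) is the smallest distance from w to any surviving
        # word that belongs to a different category.
        scores = []
        for k, group in enumerate(remaining):
            others = [v for j, g in enumerate(remaining) if j != k for v in g]
            for w in group:
                scores.append((min(dist(w, v) for v in others), k, w))
        # Step (2): delete the word with the smallest d(w); a word that is
        # the last one in its category is skipped in favor of the next one.
        scores.sort(key=lambda item: item[0])
        for _, k, w in scores:
            if len(remaining[k]) > 1:
                remaining[k].remove(w)
                break
    return [g[0] for g in remaining]


# One list of substitutable words per category (sample data only).
groups = [["start", "begin"], ["stop", "halt"], ["up", "raise"]]
print(select_vocabulary_greedy(groups, toy_distance))
```

Each pass removes exactly one word, so the loop runs once per surplus word rather than once per combination of words.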

The following method is presented here as a way of determining D(X, Y).

This method uses the phonetic description of a word, that is, the word expressed as a sequence of phonemes (or syllables). If the distances between phonemes (or syllables) are defined in advance on the basis of phonetic knowledge, the inter-word distance can be calculated by using this inter-phoneme distance in place of the distance between pattern vectors in the ordinary DP matching method. That is, let the phonetic descriptions of the two words to be compared be A = {a_1, …, a_M} and B = {b_1, …, b_N}, and let the distance between phonemes x and y be d(x, y). The inter-word distance D(A, B) can then be calculated by the following recurrence formula.

Initial conditions:

$$g(m,n) = \begin{cases} 0 & (m = 0,\ n = 0) \\ \infty & (m > 0,\ n = 0)\ \text{or}\ (m = 0,\ n > 0) \end{cases}$$

Recurrence, for m = 1, …, M; n = 1, …, N: [formula missing from this text]

$$D(A,B) = \frac{g(M,N)}{M+N}$$

(Embodiment) FIG. 1 is a block diagram showing an embodiment of an apparatus implementing the present invention. Reference numeral 1 denotes a pattern buffer, which stores the phonetic descriptions of the word groups belonging to each category. Reference numeral 2 denotes an inter-pattern distance calculation unit, which calculates the set of distances {d(j_1, j_2)} between the words W^k(1), …, W^k(J(k)) belonging to each category k in pattern buffer 1 and stores them in inter-pattern distance buffer 3. Here d(j_1, j_2) is the inter-pattern distance between words W^k(j_1) and W^k(j_2), calculated by the method described above. Reference numeral 4 denotes a pattern selection unit, which reads the contents of distance buffer 3, executes the sequential word-deletion procedure described above, finally selects the words to be used as standard patterns, and outputs the selection result. Recognition unit 5 registers standard patterns for the words selected in this way and uses them to perform word recognition. Any form of word recognition may be used, provided it is based on matching against standard patterns.
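The inter-word distance D(A, B) that the distance calculation unit computes from phonetic descriptions can be sketched as follows. The initial conditions and the normalization by M + N follow the text; the recurrence step itself is an assumption, since it does not appear in this text. The sketch uses the standard symmetric DP-matching form (weight 2 on the diagonal step), which is consistent with dividing by M + N. The 0/1 phoneme distance and all names are hypothetical; the text calls for distances defined from phonetic knowledge. Both sequences are assumed to be non-empty.

```python
import math


def phoneme_distance(x, y):
    # Hypothetical 0/1 phoneme distance; the text instead assumes a table
    # defined in advance from phonetic knowledge.
    return 0.0 if x == y else 1.0


def word_distance(a, b, d=phoneme_distance):
    # a, b: non-empty sequences of phoneme (or syllable) symbols.
    M, N = len(a), len(b)
    # Initial conditions: g(0, 0) = 0, infinity along the other borders.
    g = [[math.inf] * (N + 1) for _ in range(M + 1)]
    g[0][0] = 0.0
    for m in range(1, M + 1):
        for n in range(1, N + 1):
            cost = d(a[m - 1], b[n - 1])
            # ASSUMED recurrence (standard symmetric DP matching); the
            # recurrence is not reproduced in the text.
            g[m][n] = min(g[m - 1][n] + cost,
                          g[m - 1][n - 1] + 2 * cost,
                          g[m][n - 1] + cost)
    # Normalization given in the text: D(A, B) = g(M, N) / (M + N).
    return g[M][N] / (M + N)


print(word_distance(["s", "t", "o", "p"], ["s", "t", "a", "t", "o"]))
```

A function of this shape could be passed as the distance argument to a selection procedure, with each word represented by its phoneme sequence.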

(Effects of the Invention) As described above, according to the present invention, means for automatically determining the combination that is least likely to cause misrecognition among a plurality of word groups can be incorporated into the apparatus, so that a speech recognition apparatus with high recognition accuracy can be realized.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention. In the figure: 1, pattern buffer; 2, distance calculation unit; 3, distance buffer; 4, selection unit; 5, recognition unit.

Claims

[Claims]

1. A word speech recognition apparatus comprising means which, when a vocabulary to be recognized is set, defines for each target word function a word group consisting of a plurality of mutually substitutable words as an alternative word group and determines the vocabulary by selecting one word from each alternative word group, characterized in that the selection of words in the vocabulary determining means is performed using the inter-word distances calculated from the phonetic descriptions of the words in the alternative word groups.