JPH067349B2

JPH067349B2 - Voice recognition system

Info

Publication number: JPH067349B2
Application number: JP61078823A
Authority: JP
Inventors: 文雄外川; 徹上田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1986-04-05
Filing date: 1986-04-05
Publication date: 1994-01-26
Anticipated expiration: 2009-01-26
Also published as: JPS62235992A

Description

[Detailed description of the invention]

Field of Industrial Application

The present invention relates to a voice recognition method that is advantageously used in, for example, a Japanese voice recognition device that recognizes input voice in syllable units.

Background Art

In conventional voice recognition methods used in Japanese voice recognition input devices that recognize input voice in syllable units, continuous speech is decomposed into syllables and the extracted syllables are recognized mainly by pattern matching. In that case, the drop in the syllable recognition rate caused by variations in utterance and by deformation due to the influence of preceding and following sounds, known as coarticulation, is prevented by giving each syllable a plurality of feature standard patterns. The recognition rate is further improved by updating those feature standard patterns with newly input patterns.

Problems to Be Solved by the Invention

In the prior art described above, no number of standard patterns is set for each syllable. Consequently, as learning proceeds according to the frequency with which syllables are input, the number of patterns per syllable becomes skewed, and recognition of syllables with low input frequency becomes extremely poor.

An object of the present invention is to solve the above technical problem and to provide a voice recognition method in which the share of the feature standard patterns held by each syllable does not become skewed by input frequency, so that fluctuation of the recognition rate is kept as small as possible and recognition accuracy is improved.

Means for Solving the Problems

The present invention is a voice recognition method in which input voice is recognized in syllable units by calculating its similarity to feature standard patterns of a plurality of kinds of syllables registered in advance and ranking candidate feature standard patterns in descending order of similarity, and in which the result is corrected by collation with a dictionary or by an external instruction operation such as a keyboard operation to obtain the final input. The method is characterized in that a recognition contribution that depends on the candidate rank and on whether a correction was made is calculated; a maximum number of feature standard patterns is set for each syllable; after the minimum feature standard patterns needed for recognition are initially registered, the feature standard patterns are updated during input by automatically registering input feature patterns, within a range that does not exceed the maximum number of patterns for each syllable; in this updating, the pattern with the lowest recognition contribution is updated preferentially; and the accumulated recognition contributions are normalized by shifting their values so that the average of the recognition contributions of the feature standard patterns belonging to the same syllable name equals the central value of the recognition contribution.

Operation

According to the present invention, the maximum number of patterns is set to a high value for syllables that are easily affected by neighboring phonemes through coarticulation, have a statistically poor recognition rate, and are input frequently (for example, single vowels, voiced plosives, and ra-row sounds). A low value is set for syllables to which the opposite applies (for example, fricatives consisting of contracted sounds such as "sha" and plosives such as "pya"). The feature standard patterns are thereby composed efficiently, and feature standard patterns are added, updated, and deleted by automatic registration during input, within a range that does not exceed the number assigned to each syllable.

As a result, the share of the feature standard patterns held by each syllable is not skewed by input frequency, so fluctuation of the recognition rate can be kept as small as possible and a stable voice recognition state can be obtained.

Further, according to the present invention, the recognition contribution is calculated depending on the candidate rank of the feature standard pattern and on whether a correction was made, and in updating, the pattern with the lowest recognition contribution is updated preferentially. This allows the feature standard patterns to be optimized easily and quickly.

Further, according to the present invention, the accumulated recognition contributions are normalized, which balances the recognition contribution values and makes effective use of memory.

Embodiment

FIG. 1 is a block diagram showing the configuration of a Japanese voice input device 1 according to an embodiment of the present invention, and FIG. 2 is a flowchart showing the procedure of voice recognition processing in the Japanese voice input device 1. The Japanese voice input device 1 recognizes continuously uttered voice in syllable units, corrects the recognition result using a dictionary, and then transfers the result to an external device in units such as words.

First, the process moves from step n1 to step n2, where a voice signal is input. That is, the uttered voice is input to an analog input unit 3 via a microphone 2, amplified by an amplifier 4 in the analog input unit 3, and then converted into a digital signal by an analog/digital converter 5. The digital signal is input to a voice analysis unit 6 and a syllable segmentation unit 7.

Next, the process moves from step n2 to step n3, where acoustic processing and syllable extraction are performed. The voice analysis unit 6 divides the input voice into frames of about 16 ms, performs spectral analysis, and transfers the feature parameters needed for syllable segmentation to the syllable segmentation unit 7 at intervals of about 8 ms. The syllable segmentation unit 7 extracts syllables while temporarily storing the various feature parameters from the voice analysis unit 6 in a ring-shaped feature pattern buffer 8, converts the features of each syllable into a pattern, and stores the pattern in a feature pattern memory 9. The feature pattern buffer 8 is configured so that it can hold a plurality of syllables. The processing of the syllable segmentation unit 7 is started and stopped by commands from a central processing unit (hereinafter referred to as the CPU) 10.
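The framing performed by the voice analysis unit can be sketched as follows. This is only an illustration under stated assumptions: the 16 kHz sampling rate and the function name `split_into_frames` are hypothetical, and the spectral analysis itself is omitted. Only the approximate 16 ms frame length and 8 ms interval come from the description.

```python
def split_into_frames(samples, sample_rate=16000, frame_ms=16, hop_ms=8):
    """Cut a sample sequence into overlapping frames.

    sample_rate is an assumed value; the description gives only the
    approximate frame length (16 ms) and frame interval (8 ms).
    """
    frame_len = sample_rate * frame_ms // 1000  # samples per frame
    hop_len = sample_rate * hop_ms // 1000      # samples between frame starts
    frames = []
    start = 0
    # Keep only complete frames; a trailing partial frame is dropped.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop_len
    return frames
```

With the assumed sampling rate, each frame holds 256 samples and consecutive frames overlap by half.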

Next, syllable recognition is performed in step n4, and recognition result candidates are selected in step n5. The syllable recognition unit 11 calculates the pattern distance between the feature pattern of each syllable and all of the feature standard patterns stored in advance in a feature standard pattern memory 12, and outputs candidates in descending order of similarity. Candidates with the same syllable name are merged, and the result is stored in a recognition result memory 13 as the voice recognition result.
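The ranking and merging of candidates might look like the sketch below. The description does not specify the distance measure or the data layout, so the Euclidean distance, the list-of-tuples representation, and the function names `rank_candidates` and `merge_by_syllable` are all assumptions made for illustration.

```python
import math

def rank_candidates(input_pattern, standard_patterns):
    """Return (distance, index, syllable_name) tuples, most similar first.

    standard_patterns is a list of (syllable_name, feature_vector) pairs.
    Euclidean distance is an assumed stand-in for the pattern distance.
    """
    scored = []
    for index, (name, vector) in enumerate(standard_patterns):
        distance = math.dist(input_pattern, vector)
        scored.append((distance, index, name))
    scored.sort()  # smallest distance first, i.e. highest similarity first
    return scored

def merge_by_syllable(scored):
    """Merge candidates that share a syllable name, keeping rank order."""
    seen = set()
    merged = []
    for _distance, _index, name in scored:
        if name not in seen:
            seen.add(name)
            merged.append(name)
    return merged
```

Keeping the pattern index in each tuple preserves which individual standard pattern produced each rank, which the later learning step needs.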

Next, errors in the syllable recognition result are corrected in step n6, and the confirmed syllable recognition result is obtained in step n7. A correction processing unit 11a in the syllable recognition unit 11 automatically corrects errors in the syllable recognition result using the dictionary stored in a language processing dictionary memory 14. Alternatively, the operator may use a keyboard 15 to select the correct candidate from the recognition candidates for the input voice, or to correct the erroneous portion directly. The correct result confirmed in this way is converted into kanji and output as a character string.

Next, learning is performed in step n8, and the process ends in step n9. A learning control unit 11b in the syllable recognition unit 11 decides whether to learn the feature pattern of each input syllable. The decision is based on learning information that is updated by analyzing the recognition result against the confirmed character string, specifically the count values of recognition contribution counters 16, and on the recent recognition status stored in a memory 17. For a syllable that is to be learned, the feature standard patterns are updated in one of the following ways: the feature pattern of the input syllable is added as a feature standard pattern; the worst of the feature standard patterns with the same syllable name as the input syllable, that is, the pattern with the lowest recognition contribution, whose recognition contribution counter 16 has the smallest count, is replaced with the feature pattern of the input syllable; or an averaging operation is performed using that feature pattern. A feature standard pattern whose recognition contribution is extremely low may also be deleted.

The maximum number of standard patterns for each syllable is stored in a syllable maximum pattern number table 18. Everything except the voice analysis unit 6 is controlled by the CPU 10.

FIG. 3 is a flowchart showing the standard pattern learning of step n8 in more detail. The recognition contribution counters 16 of empty feature standard pattern areas that were not filled by the initial voice registration are all set to the value Cres in advance. Here, Cres is the counter reset value set for an unregistered pattern.

The recognition result of each syllable is analyzed against the syllable Sin confirmed by the correction processing unit 11a, and the recognition contribution counters 16 are updated in step m1. For example, as shown in Table 1, each counter is increased or decreased by an amount that depends on the candidate rank. That is, if the feature standard pattern ranked first is correct according to the analysis, the recognition contribution counter corresponding to that pattern is incremented by 4; if it is incorrect, that counter is decremented by 4.
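A rank-dependent counter update of this kind could be sketched as below. Only the amount for rank 1 (4) is stated in the text; the amounts for ranks 2 and 3, the dictionary `DELTA_BY_RANK`, and the function name `update_counters` are hypothetical placeholders, since Table 1 is not reproduced here.

```python
# Only the rank-1 amount (4) is stated in the description. The rank-2 and
# rank-3 amounts are hypothetical placeholders for the values in Table 1.
DELTA_BY_RANK = {1: 4, 2: 2, 3: 1}

def update_counters(counters, ranked_candidates, confirmed_syllable):
    """Adjust recognition contribution counters in place (step m1).

    counters: dict mapping pattern id -> counter value
    ranked_candidates: list of (pattern_id, syllable_name), best first
    """
    for rank, (pattern_id, name) in enumerate(ranked_candidates, start=1):
        delta = DELTA_BY_RANK.get(rank)
        if delta is None:
            break  # ranks outside the table leave the counters unchanged
        if name == confirmed_syllable:
            counters[pattern_id] += delta  # pattern pointed to the right syllable
        else:
            counters[pattern_id] -= delta  # pattern pointed to a wrong syllable
    return counters
```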

For example, Table 2 shows the case in which the correct syllable is "i" (い). The value of the recognition contribution counter Cj is increased or decreased for each pattern that became a top candidate in recognizing the correct syllable "i". Here, Cj is the recognition contribution counter corresponding to the standard pattern Pj.

Next, in step m2, it is determined whether the value of an updated recognition contribution counter Ck is less than or equal to the deletion threshold Cth. If so, the feature standard pattern Pk corresponding to the recognition contribution counter Ck is deleted in step m3. At the same time, the current number n of feature standard patterns is decreased by 1 in step m4, and the recognition contribution counter Ck of that pattern area is reset to Cres in step m5.
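Steps m2 to m5 can be sketched as follows. The slot-list representation (with `None` marking an empty pattern area) and the function name `delete_weak_patterns` are assumptions; the default values Cth = 0 and Cres = 20 are taken from the worked example given later in the description.

```python
def delete_weak_patterns(slots, counters, c_th=0, c_res=20):
    """Apply steps m2 to m5 to one syllable's pattern areas, in place.

    slots: list holding a feature pattern, or None for an empty area
    counters: recognition contribution counters, parallel to slots
    Returns the number n of registered patterns that remain.
    """
    for k, pattern in enumerate(slots):
        if pattern is not None and counters[k] <= c_th:  # m2: at or below threshold
            slots[k] = None       # m3: delete pattern Pk
            counters[k] = c_res   # m5: reset the counter of the freed area
    # m4: n is recomputed from the slots, so it reflects every deletion.
    return sum(p is not None for p in slots)
```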

If, in step m2, the value of the updated recognition contribution counter Ck is not less than or equal to the deletion threshold Cth, the process moves to step m6, where it is determined, on the basis of the recent recognition status data from the memory 17, whether to learn each input syllable. If learning is to be performed, step m7 finds, among the feature standard patterns with the same syllable name as the confirmed syllable Sin, the feature standard pattern Pm corresponding to the recognition contribution counter Cm with the smallest value. This feature standard pattern Pm is the one with the lowest recognition contribution in its category.

Next, in step m8, it is determined whether the current number n of patterns for the syllable Sin is less than the predetermined maximum number M. If n < M, the process moves to step m9, where it is determined whether the count of the recognition contribution counter Cm found in step m7 is greater than Cres, the counter value of an empty pattern area. If it is greater, the input feature pattern Pin is added as feature standard pattern P(n+1) in step m10. The current number n of patterns is then incremented by 1 in step m11, and in step m12 the recognition contribution counter Cn corresponding to the newly added feature standard pattern (P(n+1) before the increment) is set to Cset. Here, Cset is the counter value set for an added or updated feature standard pattern.

If n = M in step m8, the feature standard pattern Pm is updated in step m13. Likewise, if in step m9 the value of the recognition contribution counter Cm found in step m7 is smaller than Cres, the feature standard pattern Pm is updated in step m13. After the update, the recognition contribution counter Cm is set to Cset in step m14, as in step m12.
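The add-or-update decision of steps m7 to m14 can be sketched as below. Several points are assumptions made for illustration: the function name `learn_pattern`, the use of parallel lists holding only registered patterns, plain replacement as the form of "update" (the description also allows averaging), treating Cm equal to Cres as an update (that case is not specified), and adding directly when no pattern is registered yet. The defaults Cres = 20 and Cset = 50 are the values used in the description's worked example.

```python
def learn_pattern(patterns, counters, new_pattern, max_patterns,
                  c_res=20, c_set=50):
    """Add or update one syllable's standard patterns in place.

    patterns/counters: parallel lists for the registered patterns only
    max_patterns: the maximum number M set for this syllable
    Returns "added" or "updated".
    """
    if not patterns:  # assumption: nothing registered yet, so just add
        patterns.append(new_pattern)
        counters.append(c_set)
        return "added"
    # m7: index of the pattern with the lowest recognition contribution
    m = min(range(len(patterns)), key=counters.__getitem__)
    # m8 and m9: room left, and the weakest pattern is still above Cres
    if len(patterns) < max_patterns and counters[m] > c_res:
        patterns.append(new_pattern)  # m10 and m11: add, n grows by one
        counters.append(c_set)        # m12
        return "added"
    patterns[m] = new_pattern         # m13: replace the weakest pattern
    counters[m] = c_set               # m14
    return "updated"
```

Replacing the weakest pattern once the per-syllable limit is reached is what keeps frequently input syllables from crowding out the others.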

Finally, in step m15, the counter values are shifted according to equation (1) so that the average of the recognition contribution counters of the patterns belonging to the same syllable name equals the central value of the counter, thereby balancing the values. This normalizes the accumulated recognition contributions within each category.

$$C_i' = C_i - \left(\frac{1}{L}\sum_{j=1}^{L} C_j - C_{cen}\right) \qquad (1)$$

Here Ci′ is the value of the recognition contribution counter after balancing, Ci is its value before balancing, L is the number of feature standard patterns with the same syllable name, and Ccen is the central value of the counter variable.
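The balancing step can be illustrated with a short sketch. It follows the stated goal that the average of the counters in one category should equal Ccen, so every counter is shifted by the same amount; the function name `balance_counters` and the value 50 used for Ccen in the usage note are assumptions.

```python
def balance_counters(counters, c_cen):
    """Shift all counters of one syllable so that their mean equals c_cen."""
    if not counters:
        return []
    shift = sum(counters) / len(counters) - c_cen  # mean minus central value
    # Subtracting the same shift from every counter keeps their differences.
    return [c - shift for c in counters]
```

For example, counters of 60, 50 and 70 have a mean of 60; with an assumed Ccen of 50 each counter is lowered by 10, which preserves the ordering of the patterns while keeping the accumulated values from drifting.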

Tables 3a and 3b show an example of adding a feature standard pattern when the syllable Sin is "i" (い), Cth = 0, Cres = 20, Cset = 50, and the new feature pattern of the input syllable is denoted "いin". Tables 4a and 4b show an example of updating, and Tables 5a and 5b show an example of deletion.

(1) Addition

Table 3a shows the state before learning. After learning, the feature standard pattern "い₅" has been added, as shown in Table 3b.

(2) Update

Table 4a shows the state before learning. After learning, the feature standard pattern "い₂" has been updated, as shown in the second column of Table 4b, and balancing has been performed, as shown in the third column.

(3) Deletion

Table 5a shows the state before learning. After learning, the feature standard pattern "い₂" has been deleted, as shown in the second column of Table 5b, and balancing has been performed, as shown in the third column.

Next, referring to FIG. 4, suppose for example that the total number of feature standard patterns after learning is [expression not reproduced]. According to the present invention, the number k[j] of feature standard patterns for a given syllable j is then k[j] = m[j], as shown in FIG. 4. In the conventional case, in which no maximum number is set for each syllable, k[j] may be [expression not reproduced] or [expression not reproduced], as shown in FIG. 4(2).

Thus, in the present invention, once the minimum necessary syllable patterns are initially registered as feature standard patterns, the feature standard patterns can be optimized by performing the basic operations of addition, update, and deletion, giving priority to feature standard patterns with a low recognition contribution, within the maximum number of feature standard patterns set for each syllable.

FIG. 5 is a graph showing the average syllable recognition rate for each text when 11 texts (each averaging 650 syllables) were input, and FIG. 6 is a graph showing the change in the total number of feature standard patterns and in the number of input syllables that were learned. Here, 590 syllable patterns were registered as feature standard patterns in the initial registration, the maximum number of feature standard patterns was then set for each syllable (1020 in total), and measurements were taken for one speaker who read the texts aloud. Deletion was not performed. It can be seen that the feature standard patterns grow toward 1020 as the number of input texts increases. For example, when the second text was input, patterns were added or updated over about 300 pattern learning events (the number of syllables that passed through step m6 in FIG. 3), and the number of patterns increased by about 100. As more texts are input, the average recognition rate rises, showing the effect of learning.

Effects of the Invention

According to the present invention, candidates are ranked in descending order of similarity to the feature standard patterns, a recognition contribution that depends on the candidate rank and on whether a correction was made is calculated, and updating gives priority to the pattern with the lowest recognition contribution. The feature standard patterns can therefore be optimized easily and quickly. This also makes it possible to reduce the capacity of the memory that stores the feature standard patterns.

Further, because the present invention sets a maximum number of feature standard patterns for each syllable, the maximum number of patterns can be made large for syllables that are input frequently and have a statistically poor recognition rate, and small for syllables that are input infrequently. As a result, fluctuation of the recognition rate can be reduced with a small storage capacity.

Further, according to the present invention, normalizing the accumulated recognition contributions makes it possible to use that storage capacity effectively.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a Japanese voice input device 1 according to an embodiment of the present invention. FIG. 2 is a flowchart showing the procedure of voice recognition processing in the Japanese voice input device 1. FIG. 3 is a flowchart showing the standard pattern learning of step n8 in more detail. FIG. 4 is a diagram comparing the present invention with the prior art. FIG. 5 is a graph showing the average syllable recognition rate for each text when 11 texts (each averaging 650 syllables) were input. FIG. 6 is a graph showing the change in the total number of feature standard patterns and in the number of input syllables that were learned.

1 ... Japanese voice input device, 2 ... microphone, 6 ... voice analysis unit, 7 ... syllable segmentation unit, 8 ... feature pattern buffer, 9 ... feature pattern memory, 10 ... CPU, 11 ... syllable recognition unit, 11a ... correction processing unit, 11b ... learning control unit, 12 ... feature standard pattern memory, 13 ... recognition result memory, 14 ... language processing dictionary memory, 15 ... keyboard, 16 ... recognition contribution counter, 17 ... recent recognition status memory, 18 ... syllable maximum pattern number table, M ... maximum number of patterns for each syllable

Claims

[Claims]

1. A voice recognition method in which input voice is recognized in syllable units by calculating its similarity to feature standard patterns of a plurality of kinds of syllables registered in advance and ranking candidate feature standard patterns in descending order of similarity, and in which the result is corrected by collation with a dictionary or by an external instruction operation such as a keyboard operation to obtain a final input, the method being characterized in that: a recognition contribution that depends on the candidate rank and on whether a correction was made is calculated; a maximum number of feature standard patterns is set for each syllable; after the minimum feature standard patterns needed for recognition are initially registered, the feature standard patterns are updated during input by automatically registering input feature patterns, within a range that does not exceed the maximum number of patterns for each syllable; in this updating, the pattern with the lowest recognition contribution is updated preferentially; and the accumulated recognition contributions are normalized by shifting their values so that the average of the recognition contributions of the feature standard patterns belonging to the same syllable name equals the central value of the recognition contribution.