JPH0314359B2 - - Google Patents

Info

Publication number
JPH0314359B2
Authority
JP
Japan
Prior art keywords
dictionary
display
pattern
registered
prompt
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP58221097A
Other languages
Japanese (ja)
Other versions
JPS60113298A (en)
Inventor
Toyoshi Yamada
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP58221097A
Publication of JPS60113298A
Publication of JPH0314359B2
Granted

Description

[Detailed Description of the Invention]

[Technical Field of the Invention]

The present invention relates to a specific speaker speech recognition device that has a learning function and automatically redisplays the same prompt when an utterance is rejected, when a similar word exists, or when the utterance varies.

[Conventional Technology and Problems]

In a speech recognition device, one important factor in raising the recognition rate is optimizing the dictionary patterns against which input speech patterns are compared and matched.
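
The comparison of an input pattern against dictionary patterns can be sketched as follows. This is a minimal illustration, not the patented method: it assumes each pattern is a fixed-length feature vector and uses Euclidean distance, whereas the source does not specify the pattern representation or the distance measure. The name `rank_candidates` and the dictionary layout are hypothetical.

```python
import math

def rank_candidates(input_pattern, dictionary):
    """Rank registered words by distance to the input pattern.

    dictionary maps each word to a list of templates (assumed here to be
    fixed-length feature vectors). Returns [(word, distance), ...] sorted
    nearest first, using each word's closest template.
    """
    best = {}
    for word, templates in dictionary.items():
        if templates:  # skip words that have no template yet
            best[word] = min(math.dist(input_pattern, t) for t in templates)
    return sorted(best.items(), key=lambda item: item[1])
```

The ranked list of candidates and distances is the kind of recognition result that the later sections rely on when deciding whether to learn from an utterance.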

A specific speaker speech recognition device has a learning function for optimizing these dictionary patterns: during voice registration, the operator trains the device by uttering the same word several times. In an advanced learning function, multiple dictionary templates are provided for each word, and the dictionary is corrected (averaged), added to, or updated by deletion/addition.
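
The three dictionary operations named above (correction by averaging, addition, and deletion/addition) can be sketched on a per-word entry. This is a minimal sketch under stated assumptions: patterns are fixed-length feature vectors, averaging is a plain element-wise mean of the old template and the new input, and `MAX_TEMPLATES`, `WordEntry`, and its method names are hypothetical. The source gives neither a template limit nor an averaging formula.

```python
from dataclasses import dataclass, field
from typing import List

MAX_TEMPLATES = 3  # hypothetical limit; the source does not state one

@dataclass
class WordEntry:
    text: str  # character string of the registered word
    templates: List[List[float]] = field(default_factory=list)

    def add(self, pattern: List[float]) -> bool:
        """Addition: store the input pattern as a new template if a slot is free."""
        if len(self.templates) >= MAX_TEMPLATES:
            return False
        self.templates.append(list(pattern))
        return True

    def modify(self, index: int, pattern: List[float]) -> None:
        """Correction: replace a template by its element-wise average with the input."""
        old = self.templates[index]
        self.templates[index] = [(a + b) / 2 for a, b in zip(old, pattern)]

    def replace(self, index: int, pattern: List[float]) -> None:
        """Deletion/addition: discard one template and store the input in its place."""
        self.templates[index] = list(pattern)
```

Keeping several templates per word lets one entry cover more than one way of pronouncing the word, which is why a full entry needs the deletion/addition path.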

Voice registration has two modes: an initial registration mode and a learning mode. In initial registration, every registered word is uttered once to create the initial patterns of the dictionary. Learning is performed after initial registration, that is, after the initial patterns have been created: recognition processing is executed on the utterance of a given word and, based on the rank and distance information in the recognition result, the utterance is rejected or a dictionary pattern is corrected or added. As learning progresses further, dictionary patterns are also updated by deletion/addition.
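
One learning step of this kind can be sketched as a decision function. The source says only that the decision uses rank and distance information; the specific rule, the thresholds `REJECT_DISTANCE` and `MODIFY_DISTANCE`, and the names below are assumptions made for illustration. The candidate list is assumed to contain each word at most once, sorted nearest first.

```python
from enum import Enum

class Action(Enum):
    REJECT = "reject"
    MODIFY = "modify"
    ADD = "add"
    REPLACE = "delete/add"

# Hypothetical thresholds; the source states no values.
REJECT_DISTANCE = 100.0
MODIFY_DISTANCE = 30.0

def choose_action(candidates, prompted_word, free_slots):
    """Pick a learning action for one utterance.

    candidates: [(word, distance), ...] sorted by ascending distance.
    prompted_word: the word the operator was asked to say.
    free_slots: number of unused templates for that word.
    """
    ranks = {word: i for i, (word, _) in enumerate(candidates, start=1)}
    dists = dict(candidates)
    # Too far from every template of the prompted word: do not learn from it.
    if prompted_word not in ranks or dists[prompted_word] > REJECT_DISTANCE:
        return Action.REJECT
    # Recognized first and close: refine the existing template by averaging.
    if ranks[prompted_word] == 1 and dists[prompted_word] <= MODIFY_DISTANCE:
        return Action.MODIFY
    # Otherwise keep the utterance as a new template, replacing one if full.
    return Action.ADD if free_slots > 0 else Action.REPLACE
```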

In voice registration, it is desirable for the operator to complete the dictionary efficiently with as few utterances as possible, and to be spared unnecessary key operations and decisions so that the operator can concentrate on speaking.

[Purpose of the Invention]

The present invention is based on the above considerations. Its object is to provide a specific speaker speech recognition device with a learning function that automatically displays to the operator, in sequence, the next word to be uttered, minimizing operator decisions and key operations and thereby reducing the burden on the operator.

[Structure of the Invention]

To this end, the specific speaker speech recognition device of the present invention comprises:
a dictionary memory in which the character strings of registered words are stored and in which speech dictionary patterns for the registered words are registered;
a matching unit that compares an input pattern with the dictionary patterns and outputs a recognition result and correction information for the input pattern;
a display;
a dictionary control correction unit that additionally registers and corrects dictionary patterns in the dictionary memory;
a learning mechanism unit that receives the recognition result and the input pattern correction information output from the matching unit; and
a prompt sending control unit that, following instructions from the learning mechanism unit, reads the character string of a registered word from the dictionary memory and displays on the display a prompt requesting voice input of that registered word.

The device is characterized in that the learning mechanism unit:
controls the dictionary control correction unit to execute processing such as rejection, addition of a dictionary pattern, or correction of a dictionary pattern;
determines whether re-utterance is necessary;
if re-utterance is not necessary, performs control to display on the display a prompt requesting voice input of the next registered word; and
if re-utterance is necessary, performs control to display on the display again the prompt requesting voice input of the same registered word, and performs control to display on the display auxiliary information indicating why re-utterance is necessary.

[Embodiments of the Invention]

An embodiment of the present invention is described below with reference to the drawing.

The figure shows the configuration of one embodiment of the present invention. In the figure, 1 is an input pattern buffer, 2 is a dictionary control correction unit, 3 is a dictionary memory, 4 is a matching unit, 5 is a prompt sending control unit, 6 is a learning mechanism unit, and 7 is a display.

The input pattern buffer 1 stores the input pattern of the operator's voice input. The matching unit 4 compares the input pattern stored in the input pattern buffer 1 with the dictionary patterns stored in the dictionary memory 3 and, in learning mode, sends the matching results to the learning mechanism unit 6: the recognition result (multiple candidates and their distances) and the correction information for the input pattern.

The dictionary memory 3 consists of a table and dictionary patterns. The table stores the character strings of the registered words. The dictionary patterns provide multiple dictionary templates for each registered word, in which speech patterns are registered. In initial registration, every registered word in the table is uttered once and the resulting patterns are registered as dictionary patterns. In learning, the input pattern uttered for a registered word is compared with the dictionary patterns and, depending on the result, the input is rejected, a dictionary pattern is added, or a dictionary pattern is corrected. These operations on the dictionary memory 3 are executed by the dictionary control correction unit 2 on instructions from the learning mechanism unit 6.

As described above, the learning mechanism unit 6 controls the dictionary control correction unit 2 to execute one of rejection, addition of a dictionary pattern, or correction of a dictionary pattern, based on the distance information for the correct answer (the uttered word), the distance difference between the correct answer and other words, and the information on free dictionary templates. Rejection means that no learning is performed. Addition of a dictionary pattern stores the input pattern as is as an additional dictionary pattern. Correction of a dictionary pattern averages the dictionary pattern and replaces it with the result.

In connection with this learning result, the learning mechanism unit 6 also performs processing to show on the display 7 what the operator should utter next. The learning mechanism unit 6 has the prompt sending control unit 5 display the character strings in the table of the dictionary memory 3 on the display 7 in sequence. However, if, for example, any of the following cases (1) to (4) occurs, it judges that re-utterance is necessary and, instead of advancing to the next prompt, has the same prompt displayed again.

(1) When the input is rejected (when it is not regarded as a normal voice input).

(2) When the correct answer is not recognized in first place (similar word exists).

(3) When the correct answer is recognized in first place but its distance to the second-place candidate is small, that is, when a similar word exists (similar word exists).

(4) When the correct answer is recognized in first place but its distance exceeds a certain threshold, that is, for a word whose utterance tends to vary (utterance varies).
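
The re-utterance cases listed above can be sketched as a single check that returns the reason, or nothing when the prompt may advance. This is a sketch under stated assumptions: the thresholds `MARGIN` and `VARIATION_DISTANCE` are hypothetical (the source mentions a threshold but gives no values), rejection is passed in as a flag because the source does not say how it is detected, and the candidate list is assumed to be sorted nearest first.

```python
from typing import List, Optional, Tuple

# Hypothetical thresholds; the source names the conditions but no values.
MARGIN = 10.0              # minimum gap between first and second candidates
VARIATION_DISTANCE = 40.0  # first-place distance above this suggests variation

def reutterance_reason(
    candidates: List[Tuple[str, float]],
    prompted_word: str,
    rejected: bool = False,
) -> Optional[str]:
    """Return None if the prompt may advance, else why the word must be repeated."""
    # Rejected input: not regarded as a normal voice input.
    if rejected or not candidates:
        return "reject"
    top_word, top_dist = candidates[0]
    # The prompted word was not recognized in first place.
    if top_word != prompted_word:
        return "similar word exists"
    # First place, but the runner-up is too close.
    if len(candidates) > 1 and candidates[1][1] - top_dist < MARGIN:
        return "similar word exists"
    # First place, but far from its own template.
    if top_dist > VARIATION_DISTANCE:
        return "utterance varies"
    return None
```

Returning the reason as a value, rather than a bare yes/no, means the same result can drive both the decision to repeat the prompt and the explanation shown to the operator.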

Furthermore, to notify the operator why re-utterance is necessary, the learning mechanism unit 6 has the display 7 show the following items (1) and (2) as auxiliary information.

(1) The recognition result candidates down to about fourth place.

(2) Which of the following applies: rejection, similar word exists, or utterance varies.
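
The prompting behavior described above can be sketched as a loop that advances to the next word only when no re-utterance is needed, and otherwise repeats the prompt together with the auxiliary information. The callables `recognize` and `judge`, the message wording, and the `max_attempts` cap are all assumptions for illustration. The source does not describe an attempt limit.

```python
def run_learning(words, recognize, judge, display=print, max_attempts=5):
    """Prompt for each registered word in turn and repeat a prompt when needed.

    recognize(word) -> [(candidate, distance), ...], nearest first; it stands
        in for capturing the operator's utterance and matching it.
    judge(candidates, word) -> None, or a reason string for re-utterance.
    Returns {word: number of utterances used}.
    """
    attempts = {}
    for word in words:
        for attempt in range(1, max_attempts + 1):
            display(f"Say: {word}")
            candidates = recognize(word)
            reason = judge(candidates, word)
            attempts[word] = attempt
            if reason is None:
                break  # advance to the next registered word
            # Auxiliary information: top candidates (up to four) and the reason.
            top = ", ".join(f"{w} ({d:g})" for w, d in candidates[:4])
            display(f"Repeat needed: {reason}. Candidates: {top}")
    return attempts
```

With this structure the operator never chooses what to say next or presses a key to continue: the loop alone decides whether the same prompt or the next one appears.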

[Effect of the Invention]

As is clear from the above description, according to the present invention, in a specific speaker speech recognition device with a learning function, the recognition device itself automatically determines, as learning proceeds, how far (how many times) each word needs to be uttered, and displays the prompts accordingly. The operator therefore only has to speak as prompted, without extra decisions or key operations, which reduces the operator's burden and improves learning efficiency.

[Brief Explanation of the Drawing]

The figure shows the configuration of one embodiment of the present invention.
1: input pattern buffer; 2: dictionary control correction unit; 3: dictionary memory; 4: matching unit; 5: prompt sending control unit; 6: learning mechanism unit; 7: display.

Claims (1)

[Scope of Claims]

1. A specific speaker speech recognition device comprising:
a dictionary memory in which the character strings of registered words are stored and in which speech dictionary patterns for the registered words are registered;
a matching unit that compares an input pattern with the dictionary patterns and outputs a recognition result and correction information for the input pattern;
a display;
a dictionary control correction unit that additionally registers and corrects dictionary patterns in the dictionary memory;
a learning mechanism unit that receives the recognition result and the input pattern correction information output from the matching unit; and
a prompt sending control unit that, following instructions from the learning mechanism unit, reads the character string of a registered word from the dictionary memory and displays on the display a prompt requesting voice input of that registered word,
characterized in that the learning mechanism unit is configured to:
control the dictionary control correction unit to execute processing such as rejection, addition of a dictionary pattern, or correction of a dictionary pattern;
determine whether re-utterance is necessary;
if re-utterance is not necessary, perform control to display on the display a prompt requesting voice input of the next registered word; and
if re-utterance is necessary, perform control to display on the display again the prompt requesting voice input of the same registered word, and perform control to display on the display auxiliary information indicating why re-utterance is necessary.

2. The specific speaker speech recognition device according to claim 1, characterized in that the auxiliary information includes the recognition results from the first rank to the Nth rank (N being an integer of 2 or more).
JP58221097A 1983-11-24 1983-11-24 Voice registration system for specified speaker's voice recognition equipment Granted JPS60113298A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
JP58221097A, JPS60113298A (en) | 1983-11-24 | 1983-11-24 | Voice registration system for specified speaker's voice recognition equipment

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
JP58221097A, JPS60113298A (en) | 1983-11-24 | 1983-11-24 | Voice registration system for specified speaker's voice recognition equipment

Publications (2)

Publication Number | Publication Date
JPS60113298A (en) | 1985-06-19
JPH0314359B2 (en) | 1991-02-26

Family

ID=16761438

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
JP58221097A (Granted), JPS60113298A (en) | Voice registration system for specified speaker's voice recognition equipment | 1983-11-24 | 1983-11-24

Country Status (1)

Country | Link
JP (1) | JPS60113298A (en)

Also Published As

Publication number | Publication date
JPS60113298A (en) | 1985-06-19
