JPH0721759B2 - Speech recognition response device - Google Patents

Speech recognition response device

Info

Publication number
JPH0721759B2
Authority
JP
Japan
Prior art keywords
voice
response
word
speed
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP58091809A
Other languages
Japanese (ja)
Other versions
JPS59216242A (en)
Inventor
洋一 竹林
英範 篠田
輝彦 浮田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp
Priority to JP58091809A
Publication of JPS59216242A
Publication of JPH0721759B2
Anticipated expiration
Expired - Lifetime

Description

[Detailed Description of the Invention]

[Technical Field of the Invention]

The present invention relates to a speech recognition response device for use in information processing systems that accept voice input.

[Technical Background of the Invention and Its Problems]

In recent years, speech recognition and speech synthesis technology have advanced remarkably. For example, continuous speech recognition and speaker-independent speech recognition have become possible, as has high-accuracy speech synthesis using linear predictive coding. Synthesis-by-rule methods for converting text into speech are also being actively researched and developed.

Using such technology, attempts are being made to develop, for example, telephone voice response service systems that provide various services over the public telephone network, and online business systems for banks and similar institutions, and their usefulness is attracting attention. The users of such systems, however, are an unspecified large population: some are unaccustomed to the system, such as the elderly and children, while others use it many times a day. Despite this, conventional devices give voice responses of uniform content at a constant speaking rate, so the dialogue between human and machine has not been smooth. The response is either redundant and irritating, or hard to understand.

[Object of the Invention]

The present invention has been made in view of these circumstances. Its object is to provide a highly practical speech recognition response device that enables natural, smooth dialogue between human and machine and thereby enables effective information processing by voice input.

[Summary of the Invention]

The present invention is a speech recognition response device that recognizes input speech and outputs response speech for the recognition result, characterized by comprising: detecting means for detecting the duration of each word of the input speech; recognition means for recognizing the words of the input speech; storage means in which information on the standard duration of each word of the input speech is registered in advance; measuring means for measuring the speech rate of the input speech using the duration detected by the detecting means for the word identified by the recognition means and the standard-duration information registered in the storage means; and control means for controlling the speed of the response speech according to the speech rate measured by the measuring means.

[Effects of the Invention]

According to the present invention, the speed of the response speech is controlled according to the speech rate of the input speech, so a response suited to the speaker can be given. For example, a fast talker receives a fast response and a slow talker receives a slow one, which makes the dialogue between human and machine more natural and smoother.

Furthermore, because the present invention measures the speech rate of the input speech from the per-word duration of the input speech and the standard duration registered for each word, the speech rate can be measured accurately even when words with the same number of characters are spoken at different rates depending on their type or content. The speed of the output response can therefore be controlled more appropriately according to the speech rate of the input speech.
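
The advantage of per-word standard durations over a simple character count can be illustrated with a small sketch. The word list, the durations, and both function names below are hypothetical; the patent gives no numerical values. Two words with the same number of characters but different standard durations yield different characters-per-second figures even when both are spoken at a normal pace, whereas the ratio to the registered standard duration is the same for both.

```python
# Hypothetical standard durations in seconds (illustrative values only,
# not taken from the patent). Both words have five characters.
STANDARD_DURATION_S = {"tokyo": 0.5, "osaka": 0.75}


def chars_per_second(word, measured_s):
    """Naive rate: characters divided by the measured duration."""
    return len(word) / measured_s


def relative_rate(word, measured_s):
    """Rate relative to the word's own standard duration.

    A value above 1 means the word was spoken faster than its standard,
    a value below 1 means slower.
    """
    return STANDARD_DURATION_S[word] / measured_s


# Both words spoken exactly at their standard duration:
naive = [chars_per_second("tokyo", 0.5), chars_per_second("osaka", 0.75)]
relative = [relative_rate("tokyo", 0.5), relative_rate("osaka", 0.75)]
print(naive)     # the two naive rates differ
print(relative)  # the two relative rates agree
```

In this sketch the naive measure would wrongly rank one speaker as faster, while the per-word measure treats both utterances as standard pace.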

[Embodiments of the Invention]

Embodiments of the present invention will be described below with reference to the drawings.

FIG. 1 is a schematic configuration diagram of the device of the first embodiment. This device takes words as its recognition targets and controls the speed of the voice response according to the rate at which those words are spoken. The input speech passes through an analyzer 1, which applies A/D conversion, spectrum analysis, and similar processing to convert it into a time series of feature parameters; this series is stored in a voice pattern memory 2. A voice section detector 3 uses, for example, the energy information in the feature parameter time series to detect the start and end of each word in the voice pattern, and the word data portion is cut out accordingly. A pattern matching circuit 4 then matches the voice pattern of the word data against the standard patterns of a plurality of words registered in advance in a word dictionary memory 5, and thereby recognizes the word. The matching is performed by, for example, a similarity calculation method. The recognition result is supplied to a voice response output unit 6.
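
The patent does not specify how the voice section detector uses the energy information. As a rough illustration only, the following sketch assumes a simple fixed threshold on per-frame energy; the function names, the threshold, and the frame shift are all hypothetical.

```python
def detect_word_boundaries(frame_energies, threshold):
    """Return (start, end) frame indices of the first run of frames whose
    energy is at or above `threshold`, with `end` inclusive.

    Returns None if no frame reaches the threshold. This fixed-threshold
    rule is an assumption for illustration, not the patent's method.
    """
    start = None
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i          # first frame of the word
        elif start is not None:
            return start, i - 1    # energy dropped: previous frame was the end
    if start is not None:
        return start, len(frame_energies) - 1  # word runs to the last frame
    return None


def word_duration_s(bounds, frame_shift_s):
    """Convert inclusive frame bounds into a duration in seconds."""
    start, end = bounds
    return (end - start + 1) * frame_shift_s


energies = [0.1, 0.2, 3.0, 4.0, 2.5, 0.1]
bounds = detect_word_boundaries(energies, threshold=1.0)
print(bounds, word_duration_s(bounds, frame_shift_s=0.01))
```

The duration obtained this way corresponds to the word duration Li that the speech rate measurement later compares against the registered standard duration.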

Meanwhile, the recognition result obtained by the pattern matching circuit 4 is also supplied to a speech rate measuring device 7. Using the recognition result Wi and the word duration Li given by the start and end information, the speech rate measuring device 7 looks up the standard duration Ri of the recognized word Wi, registered in advance in a word duration memory 8, and calculates the speech rate v from the mean value and variance. This determines, for example, whether the speech rate v of the input speech is faster or slower than the average standard speech rate; in other words, whether the speaker is a fast, standard, or slow talker. A voice response speed controller 9 receives this speech-rate information and variably controls the speed of the response speech produced by the voice response output unit 6.
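
The description names the inputs (Wi, Li, Ri) and says that v is calculated from a mean and a variance, but it gives no formula. The sketch below is one possible reading, stated as an assumption: v is taken as the mean of the ratios Ri / Li over the recognized words, and the variance of those ratios is returned alongside it as a spread indicator. The dictionary contents, the tolerance, and the function names are hypothetical.

```python
from statistics import mean, pvariance

# Hypothetical contents of the word duration memory:
# recognized word Wi -> standard duration Ri in seconds.
STANDARD_DURATION_S = {"yes": 0.25, "balance": 0.5, "transfer": 0.75}


def speech_rate(observations):
    """Estimate the speech rate from (Wi, Li) pairs.

    `observations` is a non-empty list of (recognized word, measured
    duration in seconds). Returns (v, spread), where v is the mean of
    Ri / Li (v > 1 means faster than standard) and spread is the
    population variance of those ratios. Assumed formula, not the patent's.
    """
    ratios = [STANDARD_DURATION_S[word] / li for word, li in observations]
    return mean(ratios), pvariance(ratios)


def classify(v, tolerance=0.15):
    """Label the speaker as fast, standard, or slow (hypothetical thresholds)."""
    if v > 1.0 + tolerance:
        return "fast"
    if v < 1.0 - tolerance:
        return "slow"
    return "standard"


v, spread = speech_rate([("yes", 0.125), ("balance", 0.25)])
print(v, spread, classify(v))
```

A large spread would suggest that only some words were slowed or hurried, which is consistent with the description's remark about words that are deliberately emphasized.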

In other words, the speech rate of the input is not determined from the vowel movements or pitch of the input speech as a whole, but from the speed of the meaningful parts that depend on linguistic content, namely the words. Thus, for example, when a speaker slowly emphasizes only the words in an important part, the speech (words) given in response is also controlled to be spoken slowly.

As a result, the voice response output unit 6 outputs the response sentence determined by the recognition result at an output speed that is variably controlled according to the speech rate of the input speech. When the response speech is generated by synthesis by rule, its speed is varied by controlling the rate of change of the various parameters used for synthesis. When recorded-and-edited synthesis is used, the speed is controlled by, for example, selecting among sentences or speech units recorded in advance at different speaking rates.
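
The two control strategies mentioned above can be sketched as follows. Everything here is an illustrative assumption: the clip names, the available rates, the clamping limits, and the idea of representing rule-synthesis timing as a list of segment durations are not taken from the patent.

```python
# Hypothetical prerecorded variants of one response sentence:
# relative speaking rate -> clip identifier.
RECORDED_VARIANTS = {
    0.75: "greeting_slow",
    1.0: "greeting_standard",
    1.25: "greeting_fast",
}


def select_recorded_variant(v, variants=RECORDED_VARIANTS):
    """Recorded-and-edited synthesis: pick the clip whose recording rate
    is closest to the measured speech rate v."""
    nearest = min(variants, key=lambda rate: abs(rate - v))
    return variants[nearest]


def scale_segment_durations(durations_ms, v, min_rate=0.5, max_rate=2.0):
    """Synthesis by rule: shorten or lengthen each segment so that synthesis
    parameters change faster or slower. The rate is clamped to a
    hypothetical range so the output stays intelligible."""
    rate = max(min_rate, min(max_rate, v))
    return [d / rate for d in durations_ms]


print(select_recorded_variant(1.4))
print(scale_segment_durations([100, 200], 2.0))
```

Selecting among a few recorded variants gives coarse control with natural-sounding speech, while scaling durations in rule-based synthesis allows continuous control.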

With the device configured in this way, the voice response is given at a speaking rate matched to that of the speaker. An impatient, fast-talking person can be answered quickly, and a relaxed, slow-talking person can be answered at a gentle pace, which makes the dialogue between human and machine more natural and smoother. Overall, this improves the efficiency of information processing by speech recognition and response.

Thus, according to the present invention, the speech rate, which closely reflects the speaker's temperament, is detected and the speed of the voice response is controlled accordingly, making the dialogue with the speaker more natural. This yields substantial practical benefits, such as no longer irritating the speaker.

[Brief Description of the Drawings]

FIG. 1 is a schematic configuration diagram of a speech recognition response device according to an embodiment of the present invention.

1 ... Analyzer
2 ... Voice pattern memory
3 ... Voice section detector
4 ... Pattern matching circuit
5 ... Word dictionary memory
6 ... Voice response output unit
7 ... Speech rate measuring device
8 ... Word duration memory
9 ... Voice response speed controller

Continuation of front page
(72) Inventor: Teruhiko Ukita (浮田 輝彦), General Research Laboratory, Tokyo Shibaura Electric Co., Ltd., 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
(56) References: JP-A-59-153238 (JP, A); JP-A-57-57375 (JP, A)

Claims (1)

[Claims]

1. A speech recognition response device that recognizes input speech and outputs response speech for the recognition result, comprising:
detecting means for cutting out a word portion of the input speech and detecting the duration of each cut-out word;
recognition means for recognizing the words of the input speech;
storage means in which information on the standard duration of each word of the input speech is registered in advance;
measuring means for measuring the speech rate of the input speech using the duration detected by the detecting means for the word recognized by the recognition means and the standard-duration information registered in the storage means; and
control means for controlling the output speed of the response speech according to the speech rate measured by the measuring means.
JP58091809A 1983-05-25 1983-05-25 Speech recognition response device Expired - Lifetime JPH0721759B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58091809A JPH0721759B2 (en) 1983-05-25 1983-05-25 Speech recognition response device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58091809A JPH0721759B2 (en) 1983-05-25 1983-05-25 Speech recognition response device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
JP4139390A Division JPH0731508B2 (en) 1992-05-29 1992-05-29 Speech recognition response device

Publications (2)

Publication Number Publication Date
JPS59216242A JPS59216242A (en) 1984-12-06
JPH0721759B2 JPH0721759B2 (en) 1995-03-08

Family

ID=14036950

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58091809A Expired - Lifetime JPH0721759B2 (en) 1983-05-25 1983-05-25 Speech recognition response device

Country Status (1)

Country Link
JP (1) JPH0721759B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008026463A (en) * 2006-07-19 2008-02-07 Denso Corp Spoken dialogue device
JP2012128440A (en) * 2012-02-06 2012-07-05 Denso Corp Voice interactive device

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0631998B2 (en) * 1985-12-20 1994-04-27 株式会社東芝 Voice notification device
JP2677573B2 (en) * 1987-12-25 1997-11-17 株式会社東芝 Pattern generation method
DE19941227A1 (en) * 1999-08-30 2001-03-08 Philips Corp Intellectual Pty Method and arrangement for speech recognition
JP5326533B2 (en) 2008-12-09 2013-10-30 富士通株式会社 Voice processing apparatus and voice processing method
US10157607B2 (en) 2016-10-20 2018-12-18 International Business Machines Corporation Real time speech output speed adjustment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5757375A (en) * 1981-08-06 1982-04-06 Noriko Ikegami Electronic translator
JPS59153238A (en) * 1983-02-21 1984-09-01 Nec Corp Voice input/output system

Also Published As

Publication number Publication date
JPS59216242A (en) 1984-12-06

Similar Documents

Publication Publication Date Title
JP4536323B2 (en) Speech-speech generation system and method
US12272349B2 (en) Attention-based clockwork hierarchical variational encoder
CN103617799B (en) A kind of English statement pronunciation quality detection method being adapted to mobile device
US12536988B2 (en) Speech synthesis method and apparatus, device, and storage medium
CN102779508B (en) Sound bank generates Apparatus for () and method therefor, speech synthesis system and method thereof
WO2021118793A1 (en) Speech processing
JPH08263097A (en) Method for recognition of word of speech and system for discrimination of word of speech
JPH0721759B2 (en) Speech recognition response device
CN116013248A (en) Rap audio generation method, device, device and readable storage medium
JPS6138479B2 (en)
Prasangini et al. Sinhala speech to sinhala unicode text conversion for disaster relief facilitation in sri lanka
JPH05173589A (en) Speech recognizing and answering device
Rapp Automatic labelling of German prosody.
JPH0774960B2 (en) Method and system for keyword recognition using template chain model
Malik et al. Efficacy of current dysarthric speech recognition techniques
JP3110025B2 (en) Utterance deformation detection device
Miyazaki et al. Connectionist temporal classification-based sound event encoder for converting sound events into onomatopoeic representations
JPH0455518B2 (en)
JP2760096B2 (en) Voice recognition method
Vyas et al. Study of Speech Recognition Technology and its Significance in Human-Machine Interface
Sharma Implementation of ZCR and STE techniques for the detection of the voiced and unvoiced signals in Continuous Punjabi Speech
Deekshitha et al. Implementation of Automatic segmentation of speech signal for phonetic engine in Malayalam
JP2578771B2 (en) Voice recognition device
JPH0667685A (en) Speech synthesizing device
JPH0424720B2 (en)