JPH0242238B2 - - Google Patents

Info

Publication number
JPH0242238B2
JPH0242238B2 JP58025069A JP2506983A JPH0242238B2 JP H0242238 B2 JPH0242238 B2 JP H0242238B2 JP 58025069 A JP58025069 A JP 58025069A JP 2506983 A JP2506983 A JP 2506983A JP H0242238 B2 JPH0242238 B2 JP H0242238B2
Authority
JP
Japan
Prior art keywords
syllable
length
boundary
speech
average
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP58025069A
Other languages
Japanese (ja)
Other versions
JPS59149400A (en)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed filed Critical
Priority to JP58025069A priority Critical patent/JPS59149400A/en
Publication of JPS59149400A publication Critical patent/JPS59149400A/en
Publication of JPH0242238B2 publication Critical patent/JPH0242238B2/ja
Granted legal-status Critical Current

Links

Description

[Detailed Description of the Invention] <Technical Field> The present invention relates to an improvement of a syllable boundary selection method in a voice input device, and more specifically to a voice input device that can determine syllable boundaries according to the speaking rate.

<Prior Art> In general, in methods that extract syllables from continuously uttered speech and then identify them, the accuracy of the syllable segmentation greatly affects recognition performance.

Conventional segmentation methods have the problem that the number of segmentation errors changes as the speaking rate changes. This is because the segmentation algorithm is fixed regardless of the speaking rate.

<Purpose> The present invention has been made in view of the above. Its object is to provide a voice input device that estimates the speaking rate of continuous speech and, based on the estimated speaking rate, determines the syllable boundaries from among the syllable boundary candidates output by the syllable boundary detection unit.

<Example> The present invention is described in detail below with reference to the drawings.

FIG. 1 is a block diagram showing the overall configuration of a voice input device embodying the present invention.

In FIG. 1, the speech analysis unit 1 extracts feature parameters such as the power p(t) and the spectrum y(t) from the input speech signal at each input time t. The feature parameters extracted by the speech analysis unit 1 are input to the speaking rate detection unit 2, where the silent interval detection unit 21 and the voiced interval detection unit 22 distinguish voiced intervals from silent intervals based on, for example, the magnitude of the power p(t).
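The description does not say how units 21 and 22 turn the power p(t) into intervals. As a minimal sketch, assuming a single fixed power threshold applied frame by frame (the function name `split_by_power` and the threshold value are hypothetical, not taken from the patent), the voiced/silent split could look like this:

```python
from typing import List, Tuple

# (start_frame, end_frame_exclusive, is_voiced)
Interval = Tuple[int, int, bool]


def split_by_power(power: List[float], threshold: float) -> List[Interval]:
    """Group consecutive frames into voiced and silent runs.

    Assumption: a frame counts as voiced when its power is at or above
    `threshold`. The patent only says the decision is based on the
    magnitude of p(t), so the exact rule here is illustrative.
    """
    intervals: List[Interval] = []
    start = 0
    for t in range(1, len(power) + 1):
        # Close the current run at the end of the input, or when the
        # voiced/silent state differs from the state at the run start.
        at_end = t == len(power)
        if at_end or (power[t] >= threshold) != (power[start] >= threshold):
            intervals.append((start, t, power[start] >= threshold))
            start = t
    return intervals


if __name__ == "__main__":
    frames = [0.1, 0.9, 0.8, 0.05, 0.02, 0.7]
    for start, end, voiced in split_by_power(frames, threshold=0.5):
        print(start, end, "voiced" if voiced else "silent")
```

A practical detector would probably smooth the power or apply hysteresis before thresholding; that is left out to keep the sketch close to the text.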

The speaking rate estimation unit 23 in the speaking rate detection unit 2 then estimates and outputs the average syllable length $\bar{L}$ based on the durations of the voiced intervals in the speech input of a training sentence whose number of syllables is known.

That is, when the voice input device is first used, the user utters a training sentence whose number of syllables is known, and the speaking rate estimation unit 23 estimates the average syllable length (the reciprocal of the average speaking rate).

Suppose a sentence containing n syllables is uttered, and let L(i) be the duration of the i-th voiced interval detected by the voiced interval detection unit 22 (i = 1, 2, ..., m). The speaking rate estimation unit 23 then calculates and outputs the average syllable length

$$\bar{L} = \frac{1}{n} \sum_{i=1}^{m} L(i)$$
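The estimate can be sketched in a few lines, reading the formula as the sum of the voiced interval durations divided by the syllable count n, which is how claim 1 states it. The function name and the numbers in the usage example are hypothetical:

```python
from typing import Sequence


def average_syllable_length(voiced_durations: Sequence[float],
                            syllable_count: int) -> float:
    """Return the sum of the voiced durations L(i) divided by n.

    `voiced_durations` holds L(1), ..., L(m) in any consistent time unit;
    `syllable_count` is the known number of syllables n in the training
    sentence.
    """
    if syllable_count <= 0:
        raise ValueError("syllable_count must be positive")
    return sum(voiced_durations) / syllable_count


if __name__ == "__main__":
    # Hypothetical training utterance: three voiced intervals (seconds)
    # covering 18 syllables, giving roughly 0.15 s per syllable.
    print(average_syllable_length([0.9, 1.2, 0.6], 18))
```

Silent intervals are excluded from the sum, so pauses between phrases do not inflate the estimate.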

The bunsetsu (phrase) boundary detection unit 3 examines the duration of each silent interval detected by the silent interval detection unit 21. When the duration of a silent interval exceeds a predetermined length, the unit regards that silent interval as a bunsetsu boundary and outputs an indication to that effect.
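As a rough illustration of this rule, the sketch below keeps only the silent intervals whose duration exceeds a threshold. The function name `bunsetsu_boundaries` and the 0.2 s threshold are assumptions; the patent speaks only of "a predetermined length".

```python
from typing import List, Tuple

# (start_time, end_time) of a silent interval, in seconds
Silence = Tuple[float, float]


def bunsetsu_boundaries(silent_intervals: List[Silence],
                        min_silence: float) -> List[Silence]:
    """Return the silent intervals long enough to count as bunsetsu boundaries.

    A silent interval is treated as a boundary when its duration strictly
    exceeds `min_silence`, mirroring "exceeds a predetermined length".
    """
    return [(start, end) for start, end in silent_intervals
            if end - start > min_silence]


if __name__ == "__main__":
    silences = [(0.0, 0.05), (1.0, 1.4), (2.0, 2.1)]
    # Only the 0.4 s pause is long enough with a 0.2 s threshold.
    print(bunsetsu_boundaries(silences, min_silence=0.2))
```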

The syllable boundary detection unit 4 takes each bunsetsu delimited by the bunsetsu boundary detection unit 3 as a unit and outputs syllable boundary candidates using the feature parameters extracted by the speech analysis unit 1 (the interval between adjacent syllable boundaries is the syllable length). As shown in FIG. 2, the syllable boundary detection unit 4 may detect syllable boundaries clearly at times $t_1$ and $t_3$ yet find it difficult to decide whether a syllable boundary exists at time $t_2$. In such cases the final decision is made by the syllable boundary selection unit 5.

The syllable boundary selection unit 5 determines the syllable boundaries by comparing the syllable lengths implied by the syllable boundary candidates detected by the syllable boundary detection unit 4 with the average syllable length $\bar{L}$ estimated by the speaking rate estimation unit 23.

In the example of FIG. 2, if time $t_2$ is not a syllable boundary, a single syllable of length $t_3 - t_1$ (length A1 in the figure) exists in the time region $t_1 < t < t_3$. If $t_2$ is a syllable boundary, two syllables exist, of length $t_2 - t_1$ (length B1 in the figure) and length $t_3 - t_2$ (length B2 in the figure). The syllable boundary selection unit 5 compares these candidate syllable lengths A1, B1, and B2 with the average syllable length $\bar{L}$ to determine the syllable boundary. In the example of FIG. 2, A1 is closer to the average syllable length than B1 and B2 are, so the syllable of length A1 is selected and time $t_2$ is judged not to be a syllable boundary.
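The FIG. 2 decision can be tried out with concrete numbers. The sketch below is one possible reading: it compares the deviation of the single merged syllable with the mean deviation of the two split syllables, borrowing the averaging from the generalized formula given later in the description. The function name and all numeric values are hypothetical.

```python
def t2_is_boundary(t1: float, t2: float, t3: float, avg_len: float) -> bool:
    """Decide whether the doubtful time t2 is a syllable boundary.

    Hypothesis A: one syllable of length A1 = t3 - t1 (t2 is not a boundary).
    Hypothesis B: two syllables of lengths B1 = t2 - t1 and B2 = t3 - t2.
    The hypothesis whose lengths lie closer to `avg_len` wins.
    """
    dev_merged = abs((t3 - t1) - avg_len)
    dev_split = (abs((t2 - t1) - avg_len) + abs((t3 - t2) - avg_len)) / 2
    return dev_split < dev_merged


if __name__ == "__main__":
    # With an average syllable length of 0.15 s, a 0.16 s span is better
    # explained as one syllable than as two 0.08 s syllables.
    print(t2_is_boundary(0.0, 0.08, 0.16, avg_len=0.15))
    # A 0.31 s span is better explained as two syllables.
    print(t2_is_boundary(0.0, 0.15, 0.31, avg_len=0.15))
```

The same span can thus be segmented differently for a fast and a slow speaker, which is the point of estimating the average syllable length first.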

The syllable boundary selection algorithm carried out in the syllable boundary selection unit 5 is described below in more general form.

As shown in FIG. 3, suppose that the syllable boundaries in a time region $T_1 < t < T_2$ are difficult to determine, so that the syllable boundary detection unit 4 creates and outputs several syllable candidate sequences A, B, C, ... (here, syllable candidate sequence A consists of a syllable candidates of lengths A1, A2, ..., Aa, and likewise for sequences B, C, ...).

These syllable candidate sequences A, B, C, ... are input to the syllable boundary selection unit 5, which calculates the deviations $D_A$, $D_B$, $D_C$, ... of the sequences from the average syllable length $\bar{L}$ as

$$D_A = \frac{1}{a} \sum_{i=1}^{a} d(A(i), \bar{L}), \qquad D_B = \frac{1}{b} \sum_{i=1}^{b} d(B(i), \bar{L}), \qquad D_C = \frac{1}{c} \sum_{i=1}^{c} d(C(i), \bar{L})$$

where A(i) is the length of the i-th syllable candidate in sequence A (and likewise for B and C), and

$$d(x, y) = \begin{cases} |x - k_1 y| & \text{if a silent interval precedes the syllable of length } x \\ |x - k_2 y| & \text{if a bunsetsu boundary follows the syllable of length } x \\ |x - y| & \text{otherwise} \end{cases}$$

Here, $k_1$ is set so that $0 < k_1 < 1$, because syllables at the beginning of a bunsetsu and plosives are often shorter than the average syllable length, and $k_2$ is set so that $k_2 > 1$, because the syllable at the end of a bunsetsu is often longer.

From the deviations $D_A$, $D_B$, $D_C$, ... calculated as above, the syllable boundary selection unit 5 selects the syllable candidate sequence with the smallest deviation from the average syllable length and outputs it as the syllable sequence.
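The generalized selection can be sketched as follows. The patent fixes only the ranges 0 < k1 < 1 and k2 > 1, so the defaults 0.8 and 1.2 are placeholders, and the names `Syllable`, `distance`, `deviation`, and `select_sequence` are hypothetical. When a syllable both follows a silent interval and precedes a bunsetsu boundary, the sketch applies the first case; the source does not say which should take precedence.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Syllable:
    length: float                      # candidate syllable length
    after_silence: bool = False        # a silent interval precedes it
    before_bunsetsu_end: bool = False  # a bunsetsu boundary follows it


def distance(s: Syllable, avg_len: float, k1: float, k2: float) -> float:
    """d(x, y): distance of one candidate length from the (scaled) average."""
    if s.after_silence:
        return abs(s.length - k1 * avg_len)   # expected to be shorter
    if s.before_bunsetsu_end:
        return abs(s.length - k2 * avg_len)   # expected to be longer
    return abs(s.length - avg_len)


def deviation(seq: Sequence[Syllable], avg_len: float,
              k1: float, k2: float) -> float:
    """D for one candidate sequence: the mean of d over its syllables."""
    return sum(distance(s, avg_len, k1, k2) for s in seq) / len(seq)


def select_sequence(candidates: List[List[Syllable]], avg_len: float,
                    k1: float = 0.8, k2: float = 1.2) -> List[Syllable]:
    """Return the candidate sequence with the smallest deviation D."""
    return min(candidates, key=lambda seq: deviation(seq, avg_len, k1, k2))


if __name__ == "__main__":
    seq_a = [Syllable(0.16)]
    seq_b = [Syllable(0.08), Syllable(0.08)]
    best = select_sequence([seq_a, seq_b], avg_len=0.15)
    print("A" if best is seq_a else "B")
```

Averaging over the number of syllables in each sequence keeps sequences with many short syllables from being penalized merely for having more terms in the sum.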

The syllable recognition unit 6 matches each syllable interval obtained as described above against the standard syllable patterns stored in the syllable standard pattern memory 7 and outputs the recognition result.

In the above embodiment, a known training sentence is uttered when the voice input device is first used in order to calculate the average syllable length. The present invention is not limited to this. For example, average syllable lengths may be calculated and stored in advance for a plurality of speakers. Alternatively, several average syllable lengths may be calculated and stored for the same speaker in fast, normal, and slow speaking states, and the appropriate one selected according to the speaking state at recognition time.

<Effects> As explained above, according to the present invention the speaking rate is estimated first and the syllable boundaries are determined based on the estimated speaking rate. Syllable boundaries can therefore be detected and determined accurately regardless of differences in the speaking rate of the input speech caused by speaker characteristics and the like.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of a voice input device embodying the present invention, FIG. 2 is a diagram showing an example of detected syllable boundaries, and FIG. 3 is a diagram showing another example of detected syllable boundary candidates.

1: speech analysis unit, 21: silent interval detection unit, 22: voiced interval detection unit, 23: speaking rate estimation unit, 3: bunsetsu (phrase) boundary detection unit, 4: syllable boundary detection unit, 5: syllable boundary selection unit.

Claims (1)

[Claims] 1. A voice input device comprising:
a speaking rate estimation unit that calculates an average syllable length by dividing the sum of the durations of the voiced intervals of speech whose content is known by the number of syllables contained in that speech;
a syllable boundary detection unit that detects syllable boundaries of speech to be recognized; and
a syllable boundary selection unit that obtains, for each of a plurality of syllable boundary candidates detected by the syllable boundary detection unit, a similarity to the average syllable length calculated by the speaking rate estimation unit, and takes the candidate with the highest similarity as the syllable boundary.
JP58025069A 1983-02-16 1983-02-16 Syllable boundary selection system Granted JPS59149400A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58025069A JPS59149400A (en) 1983-02-16 1983-02-16 Syllable boundary selection system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58025069A JPS59149400A (en) 1983-02-16 1983-02-16 Syllable boundary selection system

Publications (2)

Publication Number Publication Date
JPS59149400A JPS59149400A (en) 1984-08-27
JPH0242238B2 JPH0242238B2 (en) 1990-09-21

Family

ID=12155633

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58025069A Granted JPS59149400A (en) 1983-02-16 1983-02-16 Syllable boundary selection system

Country Status (1)

Country Link
JP (1) JPS59149400A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04115336U (en) * 1991-03-29 1992-10-13 ミツミ電機株式会社 Cassette holding frame of magnetic recording/reproducing device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS59180597A (en) * 1983-03-31 1984-10-13 富士通株式会社 Voice division system
JP2578771B2 (en) * 1986-08-26 1997-02-05 松下電器産業株式会社 Voice recognition device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5324205A (en) * 1976-08-18 1978-03-06 Nec Corp Voice recognition device

Also Published As

Publication number Publication date
JPS59149400A (en) 1984-08-27

Similar Documents

Publication Publication Date Title
US8140330B2 (en) System and method for detecting repeated patterns in dialog systems
US6535850B1 (en) Smart training and smart scoring in SD speech recognition system with user defined vocabulary
US6317711B1 (en) Speech segment detection and word recognition
US20120239401A1 (en) Voice recognition system and voice recognition method
JP3069531B2 (en) Voice recognition method
Zolnay et al. Extraction methods of voicing feature for robust speech recognition.
CN106920558B (en) Keyword identification method and device
JPH0242238B2 (en)
JPH0222399B2 (en)
JP2001312293A (en) Voice recognition method and apparatus, and computer-readable storage medium
KR100350003B1 (en) A system for determining a word from a speech signal
JPH0217118B2 (en)
JP2006010739A (en) Voice recognition device
KR100597434B1 (en) Core Detector Using Modified Viterbi Algorithm and Beamwidth and Duration
KR20090068856A (en) Speech Verification Model and Speech Verification System Using Phoneme Level Log Likelihood Ratio Distribution and Phoneme Duration
JPH0772899A (en) Voice recognizer
KR100275446B1 (en) Method for choosing basic phoneme units using the phoneme recognition rate
JPH08314490A (en) Word spotting type speech recognition method and device
JPH0997095A (en) Voice recognition device
Takahashi et al. Isolated word recognition using pitch pattern information
JPS6147999A (en) Voice recognition system
JPS63217399A (en) Voice section detecting system
JPH05303391A (en) Speech recognition device
Ahmad et al. An isolated speech endpoint detector using multiple speech features
JP2891259B2 (en) Voice section detection device