JPH0652478B2

JPH0652478B2 - Recognition device

Info

Publication number: JPH0652478B2
Application number: JP57172786A
Authority: JP
Inventors: 文雄外川
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1982-09-30
Filing date: 1982-09-30
Publication date: 1994-07-06
Anticipated expiration: 2009-07-06
Also published as: JPS5961897A

Description

[Detailed Description of the Invention]
<Technical Field> The present invention relates to an improvement in recognition devices, and more specifically to a recognition device that recognizes one delimited segment of information, such as a spoken phrase (bunsetsu), in terms of finer unit elements such as phonemes, kana, syllables, or phrases.

<Prior Art> When a delimited segment of speech such as a phrase is recognized in finer units such as phonemes, kana, or syllables, the conventional approach is generally as follows. The input speech segment is acoustically processed to obtain a feature-vector input pattern for each unit (phoneme, syllable, etc.). Each input pattern is matched against standard patterns stored in advance, and candidate unit sequences are output in descending order of similarity. These candidate unit sequences are then collated with the contents of a dictionary of phrases or the like to recognize the phrase corresponding to the input.

In this conventional method, however, the input pattern is matched against the standard patterns of all phonemes, syllables, etc. to compute similarities, and candidate syllables are output in descending order of similarity.

Therefore, when recognition is performed in units of monosyllables including contracted sounds (yōon), for example, every syllable position must be matched against more than 100 monosyllable standard patterns, which takes a great deal of processing time.

In addition, dictionary collation must then be performed on every candidate unit sequence output in descending order of similarity. This lengthens the processing time without improving the accuracy with which the correct phrase is recognized, and as a result the total amount of processing required for recognition becomes enormous.

<Purpose> An object of the present invention is to provide a recognition device that eliminates the above drawbacks of the prior art: one that improves the accuracy of recognizing the correct delimited segment of information, such as a phrase, that allows processing to be designated according to the kind of information to be recognized (different tasks, fields, and so on), and that consequently reduces the total amount of processing required for recognition.

<Embodiment> An embodiment is described below in which the recognition device of the present invention recognizes a delimited speech input such as a phrase in terms of finer unit elements such as syllables.

According to the embodiment of the present invention, a recognition device that recognizes a delimited segment of information, such as a spoken phrase, in terms of N finer unit elements such as phonemes, kana, or syllables comprises the following. First, transition matrix storage means store a plurality of different transition matrices, each describing, for the character (unit element) strings of the phrases or sentences to be recognized in a given topic or field, the transition relationships (connection relationships) among (N+1) characters (unit elements). Second, transition matrix designating means designate a desired transition matrix, from among those stored, according to the content (kind, field, etc.) of the character string to be recognized. Third, processing means perform recognition processing based on the designated transition matrix, for example by excluding from recognition, when the syllable (unit element) lattice is generated, any syllable (unit element) that cannot be reached from any candidate syllable one position earlier, and/or by referring to the transition matrix for each candidate sequence when candidate sequences are created and excluding any candidate sequence that contains a non-transitioning combination of syllables (unit elements). The device is thereby configured to reduce the amount of processing in the subsequent higher-level dictionary collation.

Before the embodiment is described, the transition matrix used in the recognition device, which represents the transition relationships (connection relationships) between unit elements, is explained.

In general, a Japanese sentence written entirely in kana can be represented as the syllable string corresponding to that kana string. For example, the phrase 「地球の」 ("of the earth") consists of four unit elements called monosyllables: “ち” (chi), “きゅ” (kyu), “う” (u), and “の” (no). If the connection relationships between two adjacent syllables (“ち” to “きゅ”, “きゅ” to “う”, “う” to “の”) are examined over all of Japanese, or over the sentences of a particular field or topic, some syllable pairs are found never to connect (connection is hereinafter called transition). For example, a syllable of the pa-row (ぱ行) is preceded only by “ん” (n) or “っ” (the geminate stop). Likewise, “にゃ” (nya) does not occur at the beginning of a word, and “へ” (when pronounced "he") does not occur at the end of a word.

The first-order transition relationships of the syllables making up such phrases are described according to equation (1) below to create a transition matrix M(X, Y) as shown in Fig. 1.

In Fig. 1, the transition matrix M(X, Y) describes the transition from a character X in a character string (unit element string) to the next character Y. When there are N unit elements (syllables), it is an (N+1) × (N+1) matrix, and in hardware it is stored in a ROM or the like. The Y_0 entries, M(0, Y), hold data indicating whether each unit element (1 to N) can occur at the beginning of a phrase, and the X_0 entries, M(X, 0), hold data indicating whether each unit element (1 to N) can occur at the end of a phrase.

For example, Fig. 2 shows the transitions of the character string “赤い” (a-ka-i, "red") written into a transition matrix. Each element of the transition matrix takes one of two values, 0 (transition impossible) or 1 (transition possible), and is stored in one bit. In Fig. 2, all matrix elements other than those marked “1” are “0” and are not shown.

Next, the creation of the transition matrix is explained in a little more detail.

First, to create the transition matrix, the transition matrix memory is initialized to “0” [M(X, Y) = 0].

Next, for a character string a_1 a_2 \dots a_I (where I is the number of characters in the string), the character transitions of the string are written into the transition matrix M(X, Y) according to equation (1):
M(a_{i-1}, a_i) = 1  (i = 1, \dots, I+1; a_0 = a_{I+1} = 0)  … (1)
The transitions of every character string to be recognized are written in the same way, which completes the (first-order) transition matrix.
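The construction just described can be sketched in a few lines of Python. This is only an illustration, assuming that equation (1) sets M(a_{i-1}, a_i) = 1 for each adjacent pair and that index 0 serves as the phrase-boundary marker; the function name `build_transition_matrix` and the three-unit toy inventory are hypothetical and do not come from the patent.

```python
def build_transition_matrix(strings, units):
    """Return (M, index), where M is an (N+1) x (N+1) list of 0/1 values."""
    # Index 0 is reserved for the phrase boundary; units are numbered 1..N.
    index = {u: k for k, u in enumerate(units, start=1)}
    size = len(units) + 1
    M = [[0] * size for _ in range(size)]        # initialize every element to 0
    for s in strings:
        a = [0] + [index[u] for u in s] + [0]    # a_0 = a_{I+1} = 0
        for i in range(1, len(a)):               # i = 1 .. I+1
            M[a[i - 1]][a[i]] = 1
    return M, index

# Toy inventory (assumed); a real system would list 100 or more monosyllables.
UNITS = ["a", "ka", "i"]
M, IDX = build_transition_matrix([["a", "ka", "i"]], UNITS)
```

With the single training string a-ka-i, exactly four elements are set: boundary to "a", "a" to "ka", "ka" to "i", and "i" to boundary.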

Fig. 3 shows a concrete example of a (first-order) transition matrix M(X, Y) created in this way. As Fig. 3 shows, the bit at (X, Y) = (え, く) is “1”, so a transition from “え” (e) to “く” (ku) exists, whereas the bit at (X, Y) = (え, け) is “0”, so no transition from “え” (e) to “け” (ke) exists.

The above concerns first-order transitions, but a second-order transition matrix, and more generally an M-th order transition matrix, can be created in the same way according to equation (2) below.

M-th order transition matrix: M(X_1, X_2, X_3, \dots, X_M, Y), (N+1)^{M+1}-dimensional
M(a_{i-M}, a_{i-(M-1)}, \dots, a_i) = 1  (i = 1, \dots, I+1)  … (2)
In the embodiment of the present invention, a plurality of such transition matrices are provided, one for each kind, topic, or field of recognition target, so that a specific transition matrix can be selected as needed to perform recognition.

Next, an embodiment of the present invention is described with reference to the drawings.

Fig. 4 is a block diagram showing the configuration of a device according to one embodiment of the present invention.

In Fig. 4, reference numeral 1 denotes the transition matrix designating means, which is connected to a central processing unit (CPU) and consists of selection keys provided on the operation panel or of voice selection input means. Reference numeral 3 denotes an input unit to which the speech information to be recognized is input, 4 an amplifier, 5 an acoustic processing unit, 61, 62, …, 6K transition matrix storage means each storing a different kind of transition matrix, and 7 a recognition processing unit.

In this configuration, the transition matrix storage means 61, 62, …, 6K store different kinds of transition matrices created from texts in different fields (for example science, literature, and economics). If the speech information now being input to the input unit 3 concerns science, for example, the transition matrix designating means 1 is operated to select the transition matrix storage means (for example M1) that holds the matrix created from scientific texts, and the recognition processing unit 7 performs recognition using the transition matrix stored in the selected memory M1.
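In software terms, the designation step amounts to a lookup among stored matrices. The sketch below is one possible reading, not the patent's hardware: the registry `MATRICES`, the field names, and the 3 × 3 toy matrices are all assumptions made for illustration.

```python
# Hypothetical registry: field name -> transition matrix (toy 3 x 3 values).
MATRICES = {
    "science":    [[0, 1, 0], [0, 0, 1], [1, 0, 0]],
    "literature": [[0, 0, 1], [1, 0, 0], [0, 1, 0]],
}

def designate_matrix(field, matrices=MATRICES):
    """Return the transition matrix stored for the designated field."""
    try:
        return matrices[field]
    except KeyError:
        raise ValueError(f"no transition matrix stored for field {field!r}") from None

# The recognition stage would then work only with this matrix.
active = designate_matrix("science")
```

Raising an error for an unknown field, rather than silently falling back to some default matrix, is a design choice of this sketch; the patent does not say what happens in that case.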

Next, the recognition operation using the transition matrix selected according to the kind (field) of information to be recognized is described.

Fig. 5 is a detailed block diagram of the acoustic processing unit 5 and the recognition processing unit 7 shown in Fig. 4.

In Fig. 5, speech information input to the phrase speech input unit 21 is passed to the acoustic processing and comparison unit 22 in the next stage. Apart from the processing that uses the transition matrix memory 26, the acoustic processing and comparison unit 22 is conventional. For example, the phrase speech signal input to the phrase speech input unit 21 undergoes feature extraction for each monosyllable in the unit 22, and the feature pattern of each monosyllable is temporarily stored in a buffer inside the unit 22. Meanwhile, the storage device 23 holds a standard pattern P_i (i = 1 to N) for each monosyllable. These standard patterns P_i are read out in turn, and the unit 22 computes their match against the input feature patterns of the input speech held in its buffer.

In the prior art, this matching was computed against all standard patterns. In this embodiment, as described later, matching is computed, based on the information stored in the transition matrix memory 26, only against the standard patterns of syllables that can follow the syllables previously recognized as candidates (for the first syllable, those that can occur at the beginning of a phrase). The closest match is selected as the first candidate and successively less close matches as the following candidates, and the results are stored in the candidate syllable lattice memory 24. That is, when the syllable lattice is generated, any syllable that cannot be reached from any candidate syllable one position earlier is excluded from recognition.

The transition matrix memory 26 corresponds to the one memory (M_1) among the transition matrix storage means 61, 62, …, 6K that is designated by the transition matrix designating means 1.

The time series of candidate syllables stored in the candidate syllable lattice memory 24 is input to the candidate sequence output unit 27, which consists of the candidate sequence creation unit 25 and the transition matrix memory 26. Referring to the contents of the transition matrix memory 26 for the particular topic or field, the candidate sequence output unit 27 excludes any candidate sequence that contains an impossible syllable transition and creates only the transitionable candidate sequences, in descending order of combined reliability. Each candidate sequence is collated with the phrases stored in the dictionary 28 by the dictionary collation unit 29, and if a match is found the result is output to the phrase output unit 30.

Next, syllable recognition using the transition matrix M(X, Y) is described with reference to Fig. 6, a processing block diagram of candidate syllable creation using the transition matrix.

In this embodiment, the resulting candidate syllables are temporarily stored in time-series order in the candidate syllable lattice buffer 24. The transition matrix information described above is stored in the memory 26, and the syllable standard patterns are stored in the memory 23.

Recognition results are stored in the candidate syllable lattice 24 as shown in the following table. When the i-th syllable is to be recognized, processing proceeds as follows.

Here J(i) is the number of candidates for the i-th syllable, and S_{i,j} is the syllable number of the j-th candidate for the i-th syllable. Let the preceding syllable candidates be
X = \{S_{i-1,j}\},  j = 1, \dots, J(i-1)  (number of combinations: J(i-1); S_{l,j} = 0 when l = 0).
The rows of the transition matrix for the immediately preceding J(i-1) candidate syllables are then summed according to equation (3) below, and any syllable Y for which the resulting row m(Y) is 0 is designated as non-transitionable.

m(Y) = \bigvee_{j=1}^{J(i-1)} M(S_{i-1,j}, Y) = M(S_{i-1,1}, Y) + M(S_{i-1,2}, Y) + \dots + M(S_{i-1,J(i-1)}, Y)  … (3)
Here + denotes the logical sum (OR). Syllables for which m(Y) = 0 in equation (3), and which are therefore designated non-transitionable, are excluded from the subsequent similarity comparison. The candidate syllables for the i-th syllable are then output and written to the candidate syllable lattice 24. When i = 1 (the phrase-initial syllable), the similarity comparison is performed after excluding the syllables designated non-transitionable by row 0, M(0, Y).
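The row-OR of equation (3) can be sketched as follows. The matrix values and the syllable numbering (0 = boundary, 1 = KO, 2 = GO, 3 = KU, 4 = PI) are made up for illustration, and `allowed_syllables` is a hypothetical name; only the OR-of-rows idea is taken from the text.

```python
def allowed_syllables(M, prev_candidates):
    """Return the set of unit indices Y (1..N) for which m(Y) = 1.

    prev_candidates holds the indices S_{i-1,j}; pass [0] for the first syllable.
    """
    size = len(M)
    m = [0] * size
    for x in prev_candidates:
        for y in range(size):
            m[y] |= M[x][y]              # logical OR of the designated rows
    return {y for y in range(1, size) if m[y]}

# Toy matrix (assumed values): 0 = boundary, 1 = KO, 2 = GO, 3 = KU, 4 = PI.
M = [
    [0, 1, 1, 0, 0],   # a phrase may start with KO or GO
    [0, 0, 0, 1, 0],   # KO -> KU
    [0, 0, 0, 1, 0],   # GO -> KU
    [1, 1, 0, 0, 0],   # KU -> phrase end, KU -> KO
    [1, 0, 0, 0, 0],   # PI -> phrase end
]
first = allowed_syllables(M, [0])        # standard patterns to match for syllable 1
second = allowed_syllables(M, [1, 2])    # after the candidates KO and GO
```

Only the standard patterns whose indices appear in the returned set would be passed to the similarity comparison, which is where the saving in matching work comes from.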

The above is repeated to complete the candidate syllable lattice for one phrase of speech.

Now suppose 「国民は」 (kokumin wa) is input as one phrase of speech. The acoustic processing unit 22 extracts features for each syllable, and the feature pattern of each syllable is stored in the input pattern time-series buffer 31. Processing then moves to candidate syllable creation using the transition matrix. First, the feature pattern of the first syllable is read into the input pattern buffer 32. Processing then proceeds to step n3, where rows of the transition matrix are designated from the preceding candidate syllables according to equation (3). For the first syllable, row 0, M(0, Y), is designated in step n4 and its contents are temporarily stored in the buffer 33, and in step n5 the syllables that may occur are designated.

Next, in step n6, the feature pattern of the first syllable stored in the input pattern buffer 32 is loaded. In step n7 it is compared for similarity with those standard patterns in the syllable standard pattern memory 23 that the buffer 33 designates as syllables that may occur; these are read out in turn into the standard pattern buffer 34. Based on the result, candidate syllables are output (step n8) and written to the candidate syllable lattice 24. In this example, “KO”, “GO”, and “BO” are stored as the first-syllable candidates.

Processing then returns to step n2, where the feature pattern of the second syllable is input to the buffer 32. In step n3, the rows M(S_{1,j}, Y) (j = 1 to 3) corresponding to “KO”, “GO”, and “BO” are designated from the first-syllable candidates in the candidate syllable lattice 24. In step n4 the sum (OR) of these transition matrix rows is formed and temporarily stored in the buffer 33, and in step n5 the syllables that may occur are designated.

Processing then moves to step n6, and steps n6 to n9 are executed in the same way to store the second-syllable candidates “KU” and “GU” in the memory 24.

These operations are repeated to complete the candidate syllable lattice for one phrase.

Candidates are thus stored in the candidate syllable lattice 24. The following table compares, for the input speech 「国民は」, the conventional method, which does not use a transition matrix, with the present method.

As the above example shows, with the present method the correct character string ranks higher among the candidate sequences.

The transition matrix above is first-order, but the same technique extends to second-order and, in general, M-th order transitions.

An M-th order transition matrix is created according to equation (2) above, and the designation of syllables from the preceding candidate syllables (up to M syllables back) can be performed by equation (4) below.

That is, for the extension to the M-th order transition matrix M(X_1, X_2, \dots, X_M, Y), let the preceding syllable candidate sequences be
(X_1, X_2, \dots, X_M) = (S_{i-M,j_M}, S_{i-(M-1),j_{M-1}}, \dots, S_{i-1,j_1})
(number of combinations: J(i-M) \cdot J(i-(M-1)) \cdots J(i-1); S_{l,j} = 0 when l \le 0).
Syllable designation is then performed by
m(Y) = \bigvee M(X_1, X_2, \dots, X_M, Y)  … (4)
where the logical sum is taken over all of these combinations.

The higher the order M, the more strongly the syllables to be generated are restricted and the greater the effect.

Next, the creation of candidate syllable sequences using the transition matrix, as performed in the candidate sequence output unit 27, is described with reference to Fig. 7, a processing block diagram of candidate sequence creation using the transition matrix.

Based on the contents of the candidate syllable lattice memory 24, which stores the time series of candidate syllables output from the acoustic processing and comparison unit 22 shown in Fig. 5, the candidate syllable sequence creation unit 41 creates candidate sequences in descending order of reliability and stores them temporarily in the candidate syllable sequence buffer 42. For each candidate syllable sequence in the buffer 42, the transition matrix reference unit 43 refers to the transition matrix M(X, Y) stored in the memory 26, and the judgment unit 44 determines by equation (5) below whether the sequence is transitionable. Only transitionable candidate sequences are stored in the candidate syllable sequence output buffer 46 via the candidate sequence writing unit 45.

Let a candidate syllable sequence be a_1 a_2 \dots a_I, where a_i is the syllable number of the i-th syllable and I is the number of syllables in the sequence. The judgment unit 44 rejects the candidate sequence using the transition matrix M(X, Y) when any one of
M(a_{i-1}, a_i) = 0  (i = 1, \dots, I+1; a_0 = a_{I+1} = 0)  … (5)
holds.

Any candidate syllable sequence for which one of the conditions in equation (5) holds, that is, one that contains a non-transitionable syllable pair, is excluded. The same judgment is made for the next candidate syllable sequence, and only transitionable candidate syllable sequences are stored in the output buffer 46. Now suppose 「国民は」 is input as one phrase of speech. Through the processing of the acoustic processing and comparison unit 22, candidate syllables are stored in time series in the candidate syllable lattice memory 24 as shown in the following table.

Based on the syllable lattice stored in the memory 24, candidate sequences are created in descending order of reliability. Each is checked against the transition matrix M(X, Y), and only transitionable sequences are output. In this example the candidate syllable sequences are output as follows.

With the conventional method, which does not refer to a transition matrix, “GOKUPINWA” would be output as the most reliable candidate sequence. With the present method, a syllable transition in this candidate sequence, for example from “KU” to “PI”, is judged impossible using the transition matrix M(X, Y), and the sequence is excluded from the subsequent dictionary collation.
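The rejection test applied to each candidate sequence can be sketched as below, assuming the first-order check M(a_{i-1}, a_i) with index 0 at both phrase boundaries. The syllable numbering and the matrix contents are invented for the example, and the function names are hypothetical.

```python
def is_transitionable(M, seq):
    """True if every adjacent pair, including both boundaries, is allowed."""
    a = [0] + list(seq) + [0]                    # a_0 = a_{I+1} = 0
    return all(M[a[i - 1]][a[i]] == 1 for i in range(1, len(a)))

def filter_candidates(M, candidates):
    """Keep only transitionable sequences, preserving their reliability order."""
    return [seq for seq in candidates if is_transitionable(M, seq)]

# Toy matrix (assumed): 0 = boundary, 1 = GO, 2 = KO, 3 = KU, 4 = PI, 5 = MI.
M = [
    [0, 1, 1, 0, 0, 0],   # a phrase may start with GO or KO
    [0, 0, 0, 1, 0, 0],   # GO -> KU
    [0, 0, 0, 1, 0, 0],   # KO -> KU
    [0, 0, 0, 0, 0, 1],   # KU -> MI (but not KU -> PI)
    [1, 0, 0, 0, 0, 0],   # PI -> phrase end
    [1, 0, 0, 0, 0, 0],   # MI -> phrase end
]
candidates = [[1, 3, 4], [2, 3, 5]]   # GO-KU-PI, then KO-KU-MI
kept = filter_candidates(M, candidates)
```

Here GO-KU-PI is dropped because the KU to PI element is 0, so only KO-KU-MI would go on to dictionary collation.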

An M-th order transition matrix is created according to equation (2) above, and candidate syllable sequences can be rejected by equation (6) below.

That is, for the extension to the M-th order transition matrix M(X_1, X_2, \dots, X_M, Y), let the j-th candidate sequence be a_1 a_2 \dots a_I. The sequence is rejected when any one of
M(a_{i-M}, a_{i-(M-1)}, \dots, a_i) = 0  (i = 1, \dots, I+1)  … (6)
(where a_l = 0 when l \le 0 or l > I) holds. The higher the order M, the more strongly the candidate syllable sequences are restricted and the greater the effect.
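The M-th order case can be illustrated with a sparse stand-in for the matrix. Instead of a dense (N+1)^{M+1} array, the sketch below stores the set of (M+1)-tuples whose element would be 1; that substitution, the function names, and the toy strings are assumptions of this example and not the patent's storage scheme.

```python
def build_mth_order(strings, order):
    """Return the set of allowed (order+1)-tuples; a missing tuple means 0."""
    allowed = set()
    for s in strings:
        a = [0] * order + list(s) + [0]          # a_l = 0 for l <= 0 and l > I
        for p in range(order, len(a)):
            allowed.add(tuple(a[p - order:p + 1]))
    return allowed

def is_allowed(allowed, seq, order):
    """Reject the sequence if any window of order+1 units is not allowed."""
    a = [0] * order + list(seq) + [0]
    return all(tuple(a[p - order:p + 1]) in allowed
               for p in range(order, len(a)))

# Two toy training strings over unit numbers 1..5 (assumed data).
training = [[1, 2, 3], [4, 2, 5]]
first_order = build_mth_order(training, 1)
second_order = build_mth_order(training, 2)

mixed = [1, 2, 5]    # every adjacent pair was seen, but the triple 1-2-5 was not
ok_first = is_allowed(first_order, mixed, 1)
ok_second = is_allowed(second_order, mixed, 2)
```

The sequence 1-2-5 passes the first-order check but fails the second-order one, which is a small instance of the statement that a higher order restricts candidates more strongly.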

In this way, when candidate sequences are created, the matrix M is consulted for each candidate sequence, and any candidate sequence containing a non-transitioning combination of syllables is excluded.

The recognition target of the above recognition device is not limited to phrases; it may be syllables, words, or sentences. Likewise, the finer unit is not limited to syllables; it may be phonemes or words.

The target may also be strings of alphabetic or other characters, or strings in a programming language such as FORTRAN.

In general, the present invention can be applied to any character string in which transition relationships exist among the finer units that make up the words to be recognized.

<Effect> As described above, according to the present invention the correct candidate sequence can be extracted with high accuracy, so the accuracy of recognizing the correct phrase is increased and the amount of processing in higher-level steps such as dictionary collation is consequently reduced. In addition, a transition matrix for each topic, field, and so on can be freely selected whenever needed according to the kind, content, topic, or field of the information to be recognized, which makes recognition using transition matrices even more effective.

In the present invention, a new transition matrix M can easily be created from different kinds of transition matrices M_i and M_j of the same order, such as those created from the sentences and phrases of individual topics, by taking their sum (M = M_i \cup M_j).
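For 0/1 matrices held as nested lists, the sum described above reduces to an element-wise OR. The sketch below assumes two first-order matrices of equal size; `merge_matrices` and the 2 × 2 values are hypothetical.

```python
def merge_matrices(Mi, Mj):
    """Element-wise OR of two 0/1 transition matrices of the same shape."""
    if len(Mi) != len(Mj) or any(len(r) != len(q) for r, q in zip(Mi, Mj)):
        raise ValueError("matrices must have the same shape")
    return [[x | y for x, y in zip(r, q)] for r, q in zip(Mi, Mj)]

# Toy per-topic matrices (assumed values).
M_science = [[0, 1], [1, 0]]
M_economy = [[0, 0], [1, 1]]
M_merged = merge_matrices(M_science, M_economy)
```

A transition is allowed in the merged matrix if either source matrix allows it, so the merged matrix is never more restrictive than its inputs.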

[Brief description of drawings]

Fig. 1 shows a first-order transition matrix. Fig. 2 shows a transition matrix into which the transitions of a character string have been written. Fig. 3 shows an example of a transition matrix for phrase character strings. Fig. 4 is a block diagram showing the configuration of one embodiment of a recognition device embodying the present invention. Fig. 5 is a detailed block diagram of the recognition processing unit using the transition matrix. Fig. 6 is a processing flow diagram of candidate syllable creation using the transition matrix. Fig. 7 is a processing block diagram of candidate sequence creation using the transition matrix.
1: transition matrix designating means; 2: central processing unit (CPU); 61, 62, …, 6K: transition matrix storage means; 7: recognition processing unit.

Continuation of front page
(56) References
JP S56-29299 A
Kazuo Nakata, "Synthesis and Recognition of Speech" (音声の合成と認識), Sogo Denshi Shuppan, July 20, 1980, pp. 124-127
Institute of Electronics and Communication Engineers (電子通信学会), Pattern Recognition and Learning study group material PRL75-55 (1975), pp. 11-20
Institute of Electronics and Communication Engineers (電子通信学会), Pattern Recognition and Learning study group material PRL73-72 (November 1973), p. 12

Claims

1. A recognition device for recognizing one delimited segment of information to be recognized in terms of N finer unit elements, comprising: a plurality of different kinds of transition matrix memories, each storing information on whether connection between (N+1) unit elements is possible for predetermined unit element strings to be recognized; transition matrix memory designating means for designating a predetermined transition matrix memory from among the plurality of kinds of transition matrix memories; and processing means for performing recognition processing based on the data of the transition matrix memory designated by the transition matrix memory designating means.