JPS5949596B2

JPS5949596B2 - Audio parameter playback control method

Info

Publication number: JPS5949596B2
Application number: JP55110791A
Authority: JP
Inventors: 稔黒田; 勝行二矢田; 省二平岡
Original assignee: Matsushita Electric Industrial Co Ltd; Matsushita Electric Works Ltd
Current assignee: Panasonic Electric Works Co Ltd; Panasonic Holdings Corp
Priority date: 1980-08-12
Filing date: 1980-08-12
Publication date: 1984-12-04
Also published as: JPS5735897A

Description

DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to a playback control method for speech parameters in a so-called PARCOR-type speech synthesizer.

In general, the parameters that characterize speech are an amplitude parameter representing the loudness of the sound, a pitch parameter representing the pitch of the sound, that is, its fundamental period, and a spectral parameter representing the timbre of the sound, that is, its spectral distribution.

Conventionally, to express the spectral distribution of speech, the autocorrelation coefficient method was developed. As shown in Fig. 1, it uses the autocorrelation coefficient Sp between a sample value Xt of the speech signal and the sample value Xt-p located p samples away. However, the autocorrelation coefficient Sp also includes the correlation contributed by the (p-1) sample values lying between Xt and Xt-p, so it has the drawback of high redundancy and a poor band compression ratio. The PARCOR coefficient (partial autocorrelation coefficient) Kp removes the correlation contributed by the (p-1) sample values between Xt and Xt-p and extracts only the correlation between Xt and Xt-p.

In more detail, the PARCOR coefficients are defined as follows: "Within one frame (about 20 ms) in which the speech can be regarded as nearly stationary, the speech signal is sampled at a fixed interval (about 100 μs). The correlation coefficient between adjacent sample values is K1. For sample values separated by several intervals, the influence of the sample values lying between them is estimated by least-squares prediction and subtracted, and the resulting correlation coefficients are K2 to K10." In an ordinary PARCOR-type speech synthesizer, the amplitude parameter, the pitch parameter, and the PARCOR coefficients are abbreviated as the A parameter, the P parameter, and the K parameters respectively, and these abbreviations are used in the following description.

Among the PARCOR coefficients Kp, coefficients such as K1, K2, and K3, which represent the partial autocorrelation with points close to Xt, carry a great deal of information about the spectral distribution, whereas coefficients such as K8, K9, and K10, which represent the partial autocorrelation with points far from Xt, carry little. It is therefore known that the spectral distribution can be reproduced with sufficient accuracy by assigning many quantization bits to low-order K parameters such as K1, K2, and K3, assigning few quantization bits to high-order parameters such as K8, K9, and K10, and not transmitting K11 and beyond at all. The PARCOR method consequently achieves a better band compression ratio than the autocorrelation coefficient method, which requires the same number of bits for each of the coefficients S1 to S10.

The first object of the present invention and its combined inventions is to provide, for such a PARCOR-type speech synthesizer, an extremely simple playback control method in which the feature parameters are recorded in linearly or nonlinearly compressed form and the synthesizer recovers the original feature parameters from these compression parameters.

The second object concerns applying a PARCOR-type speech synthesizer that uses this simple playback control method to time-signal devices, alarm devices, alarm clocks, and the like. When the system is built as two chips, a control LSI that includes the data recording section and a speech synthesis LSI, so that the speech synthesis LSI is general-purpose, the sound quality and length of the recorded and reproduced message should be freely selectable by changing only the ROM section. In addition, data transfer between the control LSI and the speech synthesis LSI and the arithmetic inside the LSI are performed bit-serially, which reduces the pin count of the LSI package, simplifies assembly, and reduces the chip area.

The third object is to lower redundancy further by dividing the frequently occurring range of the low-order K parameters, which strongly affect sound quality, especially finely and applying nonlinear compression, and to reduce the storage capacity of the playback ROM, which maps each compression parameter one-to-one to its original feature parameter, by sharing playback ROM data between different compression parameters.

The configuration of the present invention and its combined inventions is described below with reference to the illustrated embodiment.
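The definition of the PARCOR coefficients quoted above (correlation between two samples after the least-squares influence of the intervening samples is removed) can be illustrated with the Levinson-Durbin recursion. The patent does not name this recursion; it is used here as a standard way of obtaining partial autocorrelation coefficients from autocorrelation values, and the function name and the 200 Hz test tone are assumptions chosen for illustration.

```python
import math

def parcor_from_autocorr(r, order):
    """Return [K1, ..., K_order] from autocorrelation values r[0..order].

    Levinson-Durbin recursion in the partial-autocorrelation sign
    convention (K1 = r[1] / r[0]). Illustrative sketch only.
    """
    phi = []      # predictor coefficients of the previous order
    err = r[0]    # prediction-error energy of the previous order
    ks = []
    for m in range(1, order + 1):
        # Correlation at lag m that is left after predicting from the
        # (m - 1) samples lying in between.
        acc = r[m] - sum(phi[j] * r[m - 1 - j] for j in range(m - 1))
        k = acc / err
        phi = [phi[j] - k * phi[m - 2 - j] for j in range(m - 1)] + [k]
        err *= 1.0 - k * k
        ks.append(k)
    return ks

# Autocorrelation of a 200 Hz tone sampled at 8 kHz (assumed test signal).
omega = 2 * math.pi * 200 / 8000
r = [math.cos(omega * lag) for lag in range(3)]
k1, k2 = parcor_from_autocorr(r, 2)
```

For this low-frequency tone, `k1` equals cos(omega), which is close to 1, and `k2` comes out at -1 up to rounding. A pure tone is fully predictable from two samples, so the recursion should not be run past order 2 on this particular input.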

Fig. 3 is a block diagram of a speech synthesizer using the speech parameter playback control method according to the present invention and its combined inventions. As shown in the figure, the synthesizer consists of two chips, a control IC (A) that includes the data recording section 40 and a speech synthesis IC (the portion excluding the dotted-line parts A and B), and data is passed between them bit-serially.

All speech feature parameters are stored in the playback ROM 1 as 10-bit data, and the number of data entries assigned to each feature parameter is allocated optimally according to how much that parameter contributes to sound quality. Fig. 4 shows the number of data entries stored in the playback ROM 1 for each of the feature parameters A, P, and K10 to K1. For the A parameter, for example, 32 entries of 10-bit data are recorded, so the relative address needed to access any A parameter entry is 5 bits wide. Because this relative address expresses the feature parameter compressed to the minimum necessary size, it is called a compression parameter. The actual feature parameter stored in the playback ROM 1 is called a playback parameter.

As is clear from the above, the playback parameters are 10 bits wide for all of the feature parameters A, P, and K10 to K1, whereas the compression parameters differ in width: 5, 6, 3, 3, 3, 3, 4, 4, 4, 5, 6, and 7 bits respectively (53 bits in total). In addition, a spare area of 3 bits, that is, 8 data entries, is reserved in the playback ROM.

One set of compression parameters (53 bits) is extracted for every 20 msec frame in which the speech signal can be regarded as nearly stationary, so the speech signal can be recorded at no more than 2650 bits/second. Taking silent intervals and repeat intervals into account, the actual rate is about 1600 bits/second.

These compression parameters (that is, relative addresses in the playback ROM 1) are stored serially, frame by frame, from the data input terminal 8 through the switching circuit 10 into the ring register 3. A relative address alone cannot retrieve stored data from the playback ROM 1. The start addresses stored in the index ROM 2 as shown in Fig. 5 are therefore read out one after another under the control of the address counter 11 and added to the relative address by the adder circuit 4. The result is the absolute address (9 bits) of the playback ROM 1, which is used to access the playback ROM 1.
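The bit allocation, the bit rate, and the start-address-plus-relative-address lookup described above can be checked with a short sketch. The bit widths and the 8 spare entries come from the text. The contiguous layout starting at address 0 in frame order is an assumption, since Fig. 4 and Fig. 5 are not reproduced here, and the names `BITS`, `start_address`, and `absolute_address` are hypothetical.

```python
# Compression-parameter widths per frame, in the order A, P, K10 ... K1.
BITS = {"A": 5, "P": 6, "K10": 3, "K9": 3, "K8": 3, "K7": 3,
        "K6": 4, "K5": 4, "K4": 4, "K3": 5, "K2": 6, "K1": 7}
SPARE_ENTRIES = 8          # spare area reserved in the playback ROM
FRAMES_PER_SECOND = 50     # one frame every 20 msec

bits_per_frame = sum(BITS.values())
max_bit_rate = bits_per_frame * FRAMES_PER_SECOND

# Assumed layout: each parameter group occupies 2**bits consecutive
# entries, groups packed back to back starting at address 0.
start_address = {}
next_free = 0
for name, nbits in BITS.items():
    start_address[name] = next_free
    next_free += 1 << nbits
rom_entries = next_free + SPARE_ENTRIES
address_bits = (rom_entries - 1).bit_length()

def absolute_address(name, compressed):
    """Index-ROM start address plus the compression parameter."""
    if not 0 <= compressed < (1 << BITS[name]):
        raise ValueError("compression parameter out of range")
    return start_address[name] + compressed
```

Under these assumptions the table needs 408 entries, which fits the 9-bit absolute address stated in the text, and the frame total is 53 bits, giving the 2650 bits/second upper bound.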

As shown in Fig. 5, the index ROM 2 stores the bit allocation of each compression parameter as a 3-bit binary number. It also provides one sharing bit, used to reduce the storage capacity of the playback ROM 1 and forming the essential part of the invention combined with the present invention, and a spare entry corresponding to the spare area in the playback ROM 1.

The bit-allocation data is sent to the playback control circuit 12, which sends that number of shift clocks to the ring register 3. The ring register 3 therefore sends each compression parameter (relative address) serially to the adder circuit with the width given by the bit allocation: 5 bits for the A parameter, 6 bits for the P parameter, 3 bits for the K10 parameter, and so on up to 7 bits for the K1 parameter. The ring register 3 is built as a dynamic shift register so that it occupies as little chip area as possible.

The start address of each feature parameter in the playback ROM 1, stored in the index ROM 2, is sent to the adder circuit 4 one bit at a time through the parallel-to-serial conversion circuit 13, so the absolute address is computed by adding one bit at a time. The serial absolute address obtained in this way is converted to parallel data by the serial-to-parallel converter 14 and can then be used to access the playback ROM 1.

As for how the 10-bit playback parameters are arranged in the playback ROM 1, the data for the A parameter, the P parameter, and high-order K parameters such as K10 to K4 may be spaced almost uniformly. For low-order K parameters such as K1, K2, and K3, however, uniform spacing is unsuitable. As shown in Fig. 6, the frequencies of ordinary human speech are in most cases far lower than the sampling frequency (for example 8 kHz). The K1 parameter, which represents the correlation between adjacent sample values (Xt and Xt-1), therefore takes values close to 1. The K2 parameter, which represents the correlation between Xt and Xt-2 after the influence of the K1 parameter has been removed by least-squares linear prediction, takes values close to -1, because the slope of the waveform hardly changes between Xt and Xt-2.

For the K1 parameter, therefore, many values close to 1 are converted to 10-bit digital data and stored in the playback ROM 1, and few values between -1 and 0 are stored. Similarly, for the K2 parameter many values close to -1 are stored in the playback ROM 1, and few values between 0 and 1 are stored. In the present invention the parameters K1, K2, and K3 are subdivided in this way according to their frequency of occurrence, that is, nonlinearly compressed, although applying such nonlinear compression is in itself already known. Mapping every compression parameter one-to-one to a playback parameter in this way requires a considerable storage capacity in the playback ROM 1.
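The one-bit-at-a-time addition of the start address and the relative address described above can be sketched as a full adder with a carry bit. The text does not state the bit order; least significant bit first is assumed here because that is what lets a single one-bit adder propagate the carry. The function name is hypothetical.

```python
def serial_add(start_address, relative, width=9):
    """Add two addresses one bit per clock, least significant bit first.

    Models a one-bit full adder with a carry flip-flop. LSB-first
    ordering is an assumption, not something the source states.
    """
    carry = 0
    result = 0
    for clock in range(width):
        a = (start_address >> clock) & 1   # bit from the parallel-to-serial side
        b = (relative >> clock) & 1        # bit shifted out of the ring register
        total = a + b + carry
        result |= (total & 1) << clock     # bit collected on the serial-to-parallel side
        carry = total >> 1                 # carry kept for the next clock
    return result
```

A short compression parameter simply contributes zero bits once its own bits have been shifted out, so the same loop serves 3-bit and 7-bit parameters alike.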

High-order K parameters such as K8, K9, and K10 pose no particular problem, since each has only 8 data entries. Low-order K parameters such as K1 and K2, however, have 128 and 64 entries respectively and need far more storage than the high-order K parameters. The frequency distribution of the K1 parameter closely resembles that of the K2 parameter once its sign is inverted. The purpose of the present invention and its combined inventions is to exploit this by sharing the K1 parameter data in the playback ROM 1 as data for the K2 parameter, thereby reducing the storage capacity of the playback ROM 1. Concretely, this is done with the circuit configuration shown in Fig. 7.

When the sharing bit 0D in the index ROM 2 is 0, this circuit operates in the same way as the conventional example. When the sharing bit 0D is 1, as shown for K2 in Fig. 5, it operates differently.

First, the compression parameter sent bit-serially from the ring register 3 has its logic values 1 and 0 inverted by the bit inversion circuit 29. Fig. 8 shows one circuit example of the bit inversion circuit 29. As shown there, when the sharing bit 0D is 1, the logic value of the compression parameter is inverted before output. In addition, the request signal (shift clock) that extracts the compression parameter from the ring register 3 one bit at a time is delayed by one bit period by the one-bit delay circuit 30. The start address (an absolute address) sent one bit at a time from the index ROM 2 arrives without delay, so the K2 parameter, originally 6 bits, is in effect stretched to 7 bits before being added to it. Moreover, as shown in Fig. 5, the start address of the K2 parameter is set to the same address as that of the K1 parameter. The relative address of the K2 parameter is therefore shifted up by the one bit by which it is shorter than the K1 parameter, and it accesses every other entry of the K1 parameter data in the playback ROM 1. The K2 playback value obtained by reusing the K1 data in this way has its sign inverted by the sign inversion circuit 31 and is then sent to the interpolation calculation circuit 5.

Figs. 9a and 9b are timing charts of this series of operations. As shown in Fig. 9a, when the LOAD signal arrives, the request clock signal CLreq is sent to the playback control circuit 12, and request signals are output until the up counter in the circuit 12 reaches (111), that is, as many times as the bit allocation of the compression parameter. When the sharing bit 0D is 1, however, the NAND circuit 32 inhibits the inputs of the AND gates 33 and 34, so the first request signal is not output. A request signal is output only after one request clock signal CLreq has arrived and inverted the flip-flop 35. The subsequent operation is the same as in Fig. 9a: request signals are output until the up counter in the playback control circuit 12 reaches (111), so the net result is a request signal delayed by one bit period.

In the present invention, whether this series of operations is performed is decided by the sharing bit 0D stored in the index ROM 2. It is therefore enough to store 1 in the sharing bit of the K2 parameter entry of the index ROM 2 and to set its start address to the same address as that of the K1 parameter. The K2 parameter data (64 entries) can then be removed from the playback ROM 1.

The feature parameters output from the playback ROM 1 are updated once per frame. If a feature parameter changes discontinuously at the junction between frames when the data is updated, the speech signal may be distorted and intelligibility may fall. The interpolation calculation circuit 5 is therefore provided so that the feature parameters change smoothly at each update, performing an approximate linear interpolation at 8 points within one frame.
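The K1/K2 table sharing described above (bit inversion, one-bit shift, shared start address, sign inversion) can be sketched as follows. The table contents are hypothetical: the real 10-bit ROM values are not given in the text, so a square-root curve stands in for "many entries near +1". The start address 0 and the assumption that the bit vacated by the one-bit delay is 0 are also illustrative.

```python
import math

K1_BITS, K2_BITS = 7, 6
K1_START = 0   # hypothetical start address shared by K1 and K2

# Hypothetical K1 table: 128 entries from -1 to +1, spaced more
# densely near +1. Floats stand in for the 10-bit ROM words.
playback_rom = [-1.0 + 2.0 * math.sqrt(i / 127) for i in range(1 << K1_BITS)]

def decode_k1(code):
    """Ordinary lookup: start address plus relative address."""
    return playback_rom[K1_START + code]

def decode_k2(code):
    """K2 lookup that reuses the K1 entries (sharing bit set)."""
    inverted = ~code & ((1 << K2_BITS) - 1)        # bit inversion circuit 29
    relative = inverted << (K1_BITS - K2_BITS)     # one-bit delay = shift up by one
    return -playback_rom[K1_START + relative]      # sign inversion circuit 31

k2_values = [decode_k2(code) for code in range(1 << K2_BITS)]
```

With this stand-in table the 64 K2 codes map to every other K1 entry, and the decoded K2 values rise monotonically from just above -1 to +1 with the finest steps near -1. The bit inversion is what keeps the code order and the value order aligned after the sign flip.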

For this purpose, as shown in Fig. 2, the timing control circuit 28 generates 8 interpolation D clocks (2.5 msec each) within one frame (20 msec), 25 parameter-read P clocks (100 μsec each) within one D clock, and 22 bit-read T clocks (4.5 μsec each) within one P clock. Of the 8 D clocks, data is read from the data input terminal 8 into the ring register 3 during the first one, D1.
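The 8-point interpolation driven by the D clocks above can be sketched as a plain linear ramp between the previous frame's value and the new one. The text only says the interpolation is "approximately linear", so the exact arithmetic of circuit 5 is unknown; the function below is an idealized assumption, and its name is hypothetical.

```python
D_CLOCKS_PER_FRAME = 8
D_CLOCK_MS = 2.5

def interpolate_frame(previous, target, steps=D_CLOCKS_PER_FRAME):
    """Parameter values at each D clock, moving linearly from
    `previous` (end of the last frame) to `target` (this frame)."""
    return [previous + (target - previous) * step / steps
            for step in range(1, steps + 1)]

frame_ms = D_CLOCKS_PER_FRAME * D_CLOCK_MS
ramp = interpolate_frame(0, 80)
```

Each parameter thus moves in eight equal steps, one every 2.5 msec, and reaches the newly read value at the end of the 20 msec frame.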

The compression parameters A, P, K10, ..., K1 are read in one after another on the odd-numbered P clocks. The A parameter, for example, is read in with five T clocks in the span T5 to T10 of the P1 interval. The even-numbered P clocks and the remaining T clocks are used as timing for the interpolation calculation circuit 5, the sound source ROM 6, the digital filter 7, and so on. The feature parameters updated to new values every 2.5 msec by the interpolation calculation circuit 5 are held temporarily in the P latch 16 and the AK latch 23 respectively.

Parameters that are not needed for the interpolation calculation for the time being are all transferred to the AK parameter stack 24 and accumulated as speech synthesis data for the digital filter 7. The data on the fundamental period of the speech held in the P latch 16 is sent through the coincidence circuit 17 and the counter 18 to the sound source ROM 6. If the speech has a fundamental period, the voiced sound source 19 is driven to generate a pulse signal with that period. If the speech has no fundamental period, the sound source control circuit 20 drives the switching circuit 22 to switch to the unvoiced sound source 21, which generates white noise with no fundamental period. The A parameter and the K parameters are then supplied to the digital filter 7, which reproduces the speech by adding amplitude and spectral distribution information to the signal supplied by the sound source circuit. In Fig. 3, 25 is an amplifier, 26 is a speaker, 27 is a crystal oscillator circuit, and 9 is a parameter code detection circuit. These are not directly related to the gist of the present invention, so their detailed description is omitted.

The present invention is configured as described above. When the speech waveform information is compressed and recorded in the data recording section, each compression parameter is formed as a relative address in the playback ROM. An index ROM is provided that stores the start address of each feature parameter group in the playback ROM in the same order as in the data recording section. By incrementing the address counter of this index ROM step by step, the corresponding start address is added to each compression parameter (relative address) read in as data, and the sum is used as the absolute address in the playback ROM to recover the original feature parameter. Recovering each feature parameter therefore requires only the playback ROM, serving as a conversion table, and the index ROM for looking up its start addresses, with no complicated arithmetic. A further advantage is that the feature parameter values stored in the playback ROM can be chosen freely, so the frequently occurring range of each parameter can be divided finely for nonlinear compression.
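The source-and-filter stage described above (pulse train for voiced speech, white noise for unvoiced speech, then a digital filter controlled by the A and K parameters) can be sketched as follows. The patent does not specify the structure of digital filter 7; an all-pole lattice filter, a common choice for PARCOR synthesis, is assumed here, and all function names and the unit pulse amplitude are hypothetical.

```python
import random

def excitation(num_samples, pitch_period=None, seed=0):
    """Pulse train when a pitch period is given, white noise otherwise."""
    if pitch_period:
        return [1.0 if n % pitch_period == 0 else 0.0
                for n in range(num_samples)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]

def lattice_synthesize(source, ks, amplitude=1.0):
    """All-pole lattice filter; ks = [K1, ..., KM] in the sign
    convention where K1 is the lag-1 correlation coefficient."""
    order = len(ks)
    b = [0.0] * (order + 1)   # backward errors from the previous sample
    out = []
    for e in source:
        f = amplitude * e     # the A parameter scales the excitation
        for m in range(order, 0, -1):
            f += ks[m - 1] * b[m - 1]
            b[m] = b[m - 1] - ks[m - 1] * f
        b[0] = f              # output sample feeds the first stage next time
        out.append(f)
    return out

voiced = lattice_synthesize(excitation(8, pitch_period=4), [0.5])
```

With a single coefficient the filter reduces to y[n] = e[n] + K1 * y[n-1], which makes the behaviour easy to check by hand before moving to ten coefficients.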

Because each word of the index ROM records the bit allocation of the compression parameter as a binary number, simply incrementing the address counter allows the bit allocation to be decoded when data is read and the request pulses needed for reading to be output. It also allows the required number of shift clocks to be generated for serially adding the absolute address and the relative address with a one-bit adder.

In the combined invention that improves on the present invention, a spare area is provided in the playback ROM that records the playback parameters, and a spare word is provided in the index ROM that supplies the address to be added to the compression parameter to form the absolute address for accessing the playback ROM. The circuit therefore does not need to be changed for each application. For a short message where intelligibility matters, for example, the spare bits can be reduced and the bit allocation of the K1 and K2 parameters increased by one bit each, or the bit allocation of each K parameter can be changed to suit individual differences in voice quality. Only the contents written to the ROM sections need to change, so a general-purpose speech synthesis LSI can be mass-produced.

Each word of data recorded in the index ROM contains the binary value of the compressed bit count together with the start address of the parameter, so shift clocks equal in number to the compressed bit count can be generated at the same time as the start address is read out. A circuit that reads each unevenly allocated compression parameter bit-serially, with exactly the required number of bits, is therefore simple to build. As a further consequence, a two-chip system combining a general-purpose speech synthesis LSI with any of various interchangeable control LSIs, for clocks, intercoms, alarms, and so on, can be built cheaply with very few connection pins.

In another combined invention that improves on the present invention, the K1 parameter playback ROM, which stores the data for the K1 parameter, allocates many data entries to the range 0 to 1 and few to the range -1 to 0. The address signal of this K1 parameter playback ROM is the compression parameter of the K1 parameter, and the K1 data is recovered by accessing the ROM with that compression parameter. The compression parameter of the K2 parameter has fewer bits than that of the K1 parameter. It is logically inverted and shifted up by the number of bits by which it is shorter than the K1 compression parameter. The shifted data is used as the address to access the K1 parameter playback ROM, and the sign of the output is inverted to recover the K2 data. A K2 parameter playback ROM can therefore be omitted, and the K2 parameter data (64 entries), which used to occupy a large share of the playback ROM, can be eliminated.

In short, the present invention starts from the observation that the frequencies of ordinary human speech are concentrated in a range far below the sampling frequency. From this it derives, on physical grounds, that the frequency distributions of the K1 and K2 parameters are necessarily complementary. Using this property, the data in the playback ROM that is matched to the frequency distribution of the K1 parameter is shared with the K2 parameter through the prescribed inversion circuits. This greatly reduces the storage capacity of the playback ROM and contributes substantially to making the speech synthesis LSI easier to manufacture.

[Brief description of the drawings]

Fig. 1 illustrates the principle of the PARCOR coefficients used in the present invention. Fig. 2 illustrates the clocks used in the same. Fig. 3 is an overall block diagram of an embodiment of the present invention and its combined inventions. Fig. 4 shows the configuration of the playback ROM used in the same. Fig. 5 shows the configuration of the index ROM used in the same. Fig. 6 is a graph of the frequency distribution of the PARCOR coefficients used in the same. Fig. 7 is a block diagram of the main part of the present invention. Fig. 8 is a partially enlarged circuit diagram of the same. Figs. 9a and 9b are timing charts of the same.

1: playback ROM, 2: index ROM, 3: ring register, 4: adder circuit, 5: interpolation calculation circuit, 6: sound source ROM, 7: digital filter.

Claims

[Scope of Claims]

1. A reproduction control method for audio parameters, in which characteristic parameters representing the amplitude, fundamental period, spectral distribution, and the like of an audio waveform are each compressed to a number of bits corresponding to their degree of contribution to sound quality and recorded in a data recording section, and audio is reproduced by restoring the original characteristic parameters from the compression parameters read sequentially from the data recording section, characterized in that: a playback ROM is provided in which a predetermined number of values of each characteristic parameter are stored in advance; an index ROM is provided in which the leading address of each characteristic parameter group in the playback ROM is stored in the same order as the compression parameters are arranged in the data; the relative address of each characteristic parameter in the playback ROM is made to correspond to the compression parameter; and each compression parameter in the data read from the data recording section is added to the leading address read sequentially from the index ROM, thereby creating an absolute address for reading out of the playback ROM the characteristic parameter corresponding to each compression parameter in the data.

2. A reproduction control method for audio parameters, in which audio parameters representing the amplitude, fundamental period, and spectral distribution of an audio waveform are each quantized with the same number of bits, each audio parameter is compressed linearly or nonlinearly with non-uniform bit allocation, each audio parameter value is associated one-to-one with a compression parameter, audio waveform data is read in the form of the compression parameters and restored to the original audio parameters, and the audio waveform is reproduced from the audio parameters, characterized in that: there are provided a playback ROM storing, for each audio parameter, the audio parameter values for its compressed number of bits, and an index ROM in which one word consists of the leading address of each audio parameter group in the playback ROM and a binary value of the compressed number of bits, and which has a number of words equal to the number of types of audio parameters plus one word for spare bits; the compression parameter is defined as the displacement of the corresponding audio parameter from the leading address in the playback ROM; an address counter is provided to read each word sequentially from the index ROM; data request pulses equal in number to the compressed number of bits read out are generated so that the audio waveform data consisting of the compression parameters is read bit-serially; the playback ROM is accessed with the address formed by adding the leading address read from the index ROM to the compression parameter, so that each audio parameter corresponding to the read data is reproduced; and when the word for the spare bits is read from the index ROM the data request pulse is not generated, a spare area corresponding to the spare bits being provided in the playback ROM.

3. A reproduction control method for audio parameters, in which characteristic parameters representing the amplitude, fundamental period, spectral distribution, and the like of an audio waveform are each compressed to a number of bits corresponding to their degree of contribution to sound quality and recorded in a data recording section, and audio is reproduced by restoring the original characteristic parameters from the compression parameters read sequentially from the data recording section, characterized in that: the partial autocorrelation coefficient between adjacent sample values is taken as a first-order coefficient and the partial autocorrelation coefficient between sample values separated by one intervening sample value is taken as a second-order coefficient; in a first-order coefficient reproduction ROM storing data relating to the first-order coefficient, a large amount of data is allocated to the region from 0 to 1 and a small amount to the region from -1 to 0; the address signal of the first-order coefficient reproduction ROM is used as the compression parameter of the first-order coefficient, and the data relating to the first-order coefficient is reproduced by accessing the ROM with that compression parameter; the number of bits of the compression parameter of the second-order coefficient is made smaller than that of the first-order coefficient; and the compression parameter of the second-order coefficient is taken in negative logic and shifted up by the number of bits by which it falls short of the first-order compression parameter, the first-order coefficient reproduction ROM is accessed with the resulting data as the address, and the sign of the output is inverted to reproduce the data relating to the second-order coefficient, whereby a ROM for second-order coefficient reproduction is omitted.
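
The address formation recited in claims 1 and 2 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed hardware: the functions build_roms, read_bits and decode_frame are hypothetical names, the bit stream is assumed to arrive most significant bit first, and a spare word is modelled simply as an index entry whose bit count is zero.

```python
def build_roms(tables):
    """Pack per-parameter value tables into one playback ROM.

    `tables` is a list of (values, bits) pairs in the order the compression
    parameters appear in the data. Each index ROM word holds the leading
    address of the group and its compressed bit count.
    """
    playback_rom = []
    index_rom = []
    for values, bits in tables:
        index_rom.append((len(playback_rom), bits))
        playback_rom.extend(values)
    return playback_rom, index_rom


def read_bits(stream, count):
    """Read `count` bits serially (assumed MSB first), one per data request pulse."""
    code = 0
    for _ in range(count):
        code = (code << 1) | next(stream)
    return code


def decode_frame(bitstream, playback_rom, index_rom):
    """Restore one frame of parameters from a serial stream of compression parameters."""
    stream = iter(bitstream)
    restored = []
    for leading_address, bits in index_rom:  # address counter steps through the index ROM
        if bits == 0:
            continue  # spare word: no data request pulse is generated
        code = read_bits(stream, bits)
        # absolute address = leading address + compression parameter
        restored.append(playback_rom[leading_address + code])
    return restored
```

For example, with a 2-bit amplitude table, a 2-bit pitch table, a two-entry spare area, and a 3-bit K table, the index ROM holds the leading addresses 0, 4, 8 and 10, and a seven-bit frame is enough to select one value from each of the three parameter tables.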