JP3885989B2

JP3885989B2 - Speech complementing method, speech complementing apparatus, and telephone terminal device

Info

Publication number: JP3885989B2
Application number: JP2001190422A
Authority: JP
Inventors: 理恵長戸; 隆之石黒
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2001-06-22
Filing date: 2001-06-22
Publication date: 2007-02-28
Anticipated expiration: 2021-06-22
Also published as: JP2003008745A

Abstract

PROBLEM TO BE SOLVED: To provide a voice complementing method that can complement speech in the user's own voice without imposing complicated operations on the user. SOLUTION: In a voice complementing method that complements voice information when a telephone terminal performs voice communication with another telephone terminal via a specified communication network, voice information from the telephone terminal is registered in advance in a voice complement information storage means connected to the specified communication network. When the user's voice is input while the telephone terminal is in voice communication with the other telephone terminal, voice information that includes the voice information transmitted from the telephone terminal is extracted from the voice information registered in the storage means, and the extracted voice information is sent to the other telephone terminal.

Description

【０００１】
【発明の属する技術分野】
本発明は、音声補完方法及び音声補完装置に係り、詳しくは、音声通話の際に電話端末ユーザの音声を補完することで電話端末の省電力化が可能となる音声補完方法及び音声補完装置並びに電話端末に関する。
【０００２】
【従来の技術】
現在のセルラー移動通信システムには、自動車電話、携帯電話（例：ＰＤＣ（Personal Digital Cellular）方式の移動通信システム）、簡易携帯電話（例：ＰＨＳ(Personal Handy phone System)などがある。これらの移動通信システムを使用する際には、移動端末（例：携帯電話）から相手先の電話番号を入力し、相手に接続された後に音声あるいはデータを相手先に送信する。
【０００３】
移動端末をユーザが使用する場合、該ユーザからの入力操作が必要となるケースは、例えば、▲１▼電話番号の入力や電話帳の登録などを行う場合、▲２▼電子メールなどの文字データを作成する場合、▲３▼受信した文字データを読む場合、▲４▼相手との通話を行う場合などがある。従来、これらの入力（▲１▼〜▲４▼）を補助する方法として以下のような技術が提案されている。
【０００４】
１．音声入力により電話装置の操作が行えるような端末操作を補助（上記▲１▼に対応）する技術として特開平８−２３３６９が提案されている。また、回線番号をボタンで押すのではなく、「言葉のコード」を用いてコードを発生することにより接続を可能とする技術が特開２０００−７８２６７で開示されている。更に、相手先が発した音声から電話番号を抽出し、電話帳に登録することのできる技術が特願平１０−３６９７７８で開示されている。
【０００５】
２．携帯端末に蓄えられた電子メールのテキストデータを、合成音声にて形態情報端末の使用者に知らせるような文字データの作成補助（上記▲２▼に対応）、受信データの読み出し補助（上記▲３▼に対応）に関する技術として特開平９−３２１８４が開示されている。
【０００６】
３．通話の補助を行う技術（上記▲４▼に対応）として電話端末に送信すべき内容をデータ入力すると、合成音声信号に変換し電話回線に送出される技術が特開平９−３２１８４で開示されている。また、使用者が特定の音声（言葉）を発生した場合に自動的に電話回線に接続し、予め記憶してある音声を出力して通報する技術が特開平６−１３１５８３に開示されている。
【０００７】
【発明が解決しようとする課題】
上述したように、従来の方法では、ユーザが文字入力したデータを音声に変える方法のみであった。この場合、ユーザが伝えたいことを伝えるためにはキー操作を行って文字入力する必要があるため、そのキー操作のための時間がかかると共に煩わしい操作をユーザに強いてしまうという問題があった。また、合成した音声（コンピュータ等に音声を喋らせる音声合成）を利用する場合、韻律(アクセントやイントネーション)の制御や明瞭性などに問題が残されており、極めて人間のものに近い音声を生成するのが現状では難しい。従って、従来の方法では、ユーザ本人の声と異なるため違和感を生じる。
【０００８】
そこで、本発明の第一の課題は、ユーザに煩雑な入力操作を課すことなく、ユーザ自身の声で音声補完をできるようにした音声補完方法及び音声補完装置並びに電話端末装置を提供することである。
【０００９】
【課題を解決するための手段】
上記第一の課題を解決するため、本発明は、請求項１に記載されるように、電話端末が他の電話端末と所定通信網を介して音声通信を行う際に音声情報の補完を行う音声補完方法において、上記電話端末から音声情報を所定通信網に接続された音声補完情報格納手段に予め登録し、上記電話端末と他方の電話端末とが音声通信を行っている際に、ユーザの音声が入力されたとき、上記電話端末から送信された音声情報を含む音声情報を該音声補完情報格納手段に登録されている音声情報から抽出し、その抽出された音声情報を上記他方の電話端末に送信するように構成される。
【００１０】
このような音声補完方法では、電話端末のユーザ（以下、ユーザＡという）が他の電話端末のユーザ（以下、ユーザＢという）と通話しているときに、その通話中の音声情報を含む音声情報が所定網に接続された音声補完情報格納手段に登録されている音声情報から抽出され、その抽出された音声情報がユーザＢの電話端末に対して送信される。
【００１１】
例えば、ユーザＡの端末から「おは」という音声情報が送信されると、音声補完情報格納手段は、予め登録されている音声情報から「おは」を含む「おはようございます」という音声情報を抽出する。このようにして抽出された「おはようございます」の音声はユーザ自身の音声でユーザＢの電話端末に送信されるので、ユーザＢは違和感なく聞くことができる。即ち、本発明によれば、ユーザＡから発せられた「おは」の音声情報から差分となる「ようございます」の音声情報の補完が行われるようになっている。その結果、ユーザ間の会話がスムーズに運ぶように支援することが可能である。
【００１２】
本発明における上記音声補完情報格納手段は、所定網に接続された通信事業者のネットワーク装置、あるいは新たなノードとして設けてもよく設置場所を限定しない。
【００１３】
上記音声補完情報格納手段に登録されている音声情報を電話端末にも登録することができるという観点から、本発明は、請求項２に記載されるように、上記音声補完方法において、上記音声補完情報格納手段は、登録された音声情報を上記電話端末に送信し、上記電話端末は、上記音声情報を受信して登録するように構成される。
【００１４】
ユーザから発せられる音声情報のうち使用頻度の高い音声情報を抽出して自動登録できるという観点から、本発明は、請求項３に記載されるように、上記音声補完方法において、上記音声補完情報格納手段は、上記電話端末と他方の電話端末とが音声通信を行っている際に、ユーザから送信される音声情報のうち出現頻度の高い音声情報を抽出して自動登録するように構成される。
【００１５】
また、上記ユーザにて使用頻度の高い音声情報を電話端末にも登録できるという観点から、本発明は、請求項４に記載されるように、上記音声補完方法において、上記電話端末は、上記音声補完情報格納手段にて抽出された出現頻度の高い音声情報を該音声補完情報格納手段より受信して登録するように構成される。
【００１６】
ユーザ自身にて上記音声補完情報格納手段に登録する音声を自由に選択し登録することができるという観点から、本発明は、請求項５に記載されるように、上記音声補完方法において、上記音声補完情報格納手段は、音声情報または出現頻度の高い音声情報のいずれかを蓄積し、蓄積された音声情報または該音声情報を伝達するための情報を上記電話端末に通知し、上記電話端末は、その通知に基づいて上記音声補完情報格納手段に登録させるべき音声情報がユーザにて選択された後、その選択結果を上記音声補完情報格納手段に報告し、上記音声補完情報格納手段は、上記無線端末からの報告に基づいて音声情報を登録するように構成される。
【００１７】
また、本発明は、請求項６に記載されるように、上記音声補完方法において、上記電話端末は、上記音声補完情報格納手段に登録させるべき音声情報がユーザにて選択された後、その選択結果に基づいて得られる音声情報を登録するように構成される。
【００１８】
音声補完情報格納手段にて補完された音声情報と同一の音声情報を再生させることが可能になるという観点から、本発明は、請求項７に記載されるように、上記音声補完方法において、上記電話端末と他方の電話端末とが音声通信を行っている際に、ユーザの音声が入力されたとき、音声補完情報格納手段は、上記電話端末から送信された音声情報を含む音声情報を該音声補完情報格納手段に登録されている音声情報から抽出して、その抽出した音声情報と同一の音声情報を再生させるための指示となる信号を上記電話端末に送信し、上記電話端末は、上記指示に従って予め登録されている音声情報の再生を行うように構成される。
【００１９】
このような音声補完方法では、音声補完情報格納手段は、ユーザＡが発した音声情報から抽出した音声情報を該ユーザＡに対し送信するのでなく、該抽出した音声情報をユーザＡの電話端末に予め登録されている音声情報から読み出して再生させるための指示を送る。即ち、音声補完情報格納手段は、ユーザＡの電話端末に対し上記指示となる信号のみを該電話端末に送信するだけなので、音声情報送信に関する無線リソースの節約が可能となる。
【００２０】
電話端末の消費電力をより低減することができるという観点から、本発明は、請求項８に記載されるように、上記音声補完方法において、上記音声補完情報格納手段は、該音声補完情報格納手段で抽出された音声情報と同一の音声情報を再生させるための指示を上記電話端末に対して送信するときに、上記電話端末が上記音声補完情報格納手段からの指示に基づいて音声情報を再生している間、音声情報の送信を停止させる指示を上記電話端末に送信し、上記電話端末は、上記指示に従って予め登録されている音声情報の再生を行っている間、ユーザからの音声入力に係らず音声送信を停止するように構成される。
【００２１】
このような音声補完方法では、電話端末が登録された音声を再生している間、ユーザからの音声入力に係らず音声送信の出力が停止されるので、該電話端末の消費電力を低減することが可能になる。
【００２２】
また、本発明は、請求項９に記載されるように、電話端末が他の電話端末と所定通信網を介して音声通信を行う際に音声情報の補完を行う音声補完方法において、上記電話端末から音声情報を所定通信網に接続された音声補完情報格納手段に予め登録し、上記電話端末と他方の電話端末とが音声通信を行っている際に、ユーザの音声が入力されたとき、上記電話端末から送信された音声情報を含む音声情報を該音声補完情報格納手段に登録されている音声情報から抽出し、その抽出された音声情報を上記電話端末及び他方の電話端末に送信するように構成される。
【００２３】
更に、上記電話端末として移動端末（例：携帯電話）を用いることができるという観点から、本発明は、請求項１０に記載されるように、上記音声補完方法において、上記電話端末として所定の通信網に接続される移動端末装置を用いるように構成される。
【００２４】
また、上記課題を解決するため、本発明は、請求項１１に記載されるように、電話端末が他の電話端末と所定通信網を介して音声通信を行う際に音声情報の補完を行う音声補完装置において、所定通信網に接続され、上記電話端末からの音声情報を予め登録する音声補完情報格納手段と、上記電話端末と他方の電話端末とが音声通信を行っている際に、ユーザの音声が入力されたとき、上記電話端末から送信された音声情報を含む音声情報を該音声補完情報格納手段に登録されている音声情報から抽出する音声情報抽出手段と、その抽出された音声情報を上記他方の電話端末に送信する音声情報送信手段とを有するように構成される。
【００２５】
また、更に、上記課題を解決するため、本発明は、請求項１８に記載されるように、所定通信網を介して他の電話端末と通信を行う電話端末装置において、
上記所定通信網には、上記電話端末が音声通信を行う際に音声情報の補完を行う音声補完装置が接続され、上記音声補完装置には、上記電話端末からの音声情報が音声補完情報格納手段に予め登録され、上記電話端末から送信された音声情報を含む音声情報を音声補完情報格納手段に登録されている音声情報から抽出し、抽出された音声情報を上記他方の電話端末に送信し、上記電話端末は、上記音声補完情報格納手段にて送信された音声情報を受信して登録するように構成される。
【００２６】
【発明の実施の形態】
以下、本発明の実施の形態を図面に基づいて説明する。
【００２７】
本発明の実施の一形態に係る音声補完方法が適用される移動通信システムは、例えば、図１に示すように構成される。
【００２８】
図１において、この移動通信システムは、例えば、ＰＨＳ方式のシステムであり、移動端末１０（携帯電話機）が無線基地局２０と無線通信を行い、ネットワーク装置３０（例えば、交換局装置）を介して他の端末（例：固定電話６０、移動端末Ｂ７０）との音声通信や非通話通信が行えるようになっている。尚、本例では、発信側の移動端末１０を移動端末Ａ、着信側の移動端末７０を移動端末Ｂと仮定する。
【００２９】
ネットワーク３０装置に接続された音声補完装置４０は、移動端末Ａ１０と固定電話６０と間で通話がなされている場合に、その通話の音声情報を認識し、移動端末Ａ１０ユーザと固定電話ユーザ間の会話の中で頻繁に用いられているフレーズ（複数に分割された音声情報の一つ）を抽出して登録（＝蓄積）する。このとき、音声補完装置４０に登録された上記フレーズは移動端末Ａ１０にも登録（＝登録）される。本発明では、移動端末Ａ１０ユーザが発したフレーズの最初の音節が音声補完装置４０に登録されたフレーズと一致した場合に、音声補完装置４０は、そのフレーズを補完して固定電話６０に流すと共に、移動端末Ａ１０に対し自移動端末Ａ１０に蓄積されている上記フレーズを流させるための指示及び該フレーズを流している間の送信をＯＦＦとする指示を無線基地局２０経由で送出する。
【００３０】
このように本発明では、移動端末Ａ１０ユーザが頻繁に使うフレーズが該ユーザ自身の声で音声補完装置４０と自移動端末Ａ１０の双方に登録される。その後、移動端末Ａ１０ユーザが音声補完装置４０に登録されているフレーズの最初の音節を発したとき、音声補完装置４０は、そのフレーズの音節と一致するフレーズを補完して相手側の固定電話６０あるいは移動端末Ｂ７０に流すようにしている。例えば、本発明を会話の中で定型文句を繰り返すことの多いユーザに適用した場合、該ユーザは定型文句登録のための入力操作を行わなくても容易に該定型文句を相手方に伝えることができるようになる。このとき、ユーザが相手方に伝える際の音声はユーザ自身の声で録音された声で提供されるので、聞き手にとって違和感のない音声を聞くことができる。
【００３１】
次に、本発明の音声補完装置４０の装置構成について説明する。
【００３２】
この音声補完装置４０は、例えば、図２のように構成され、音声認識部４１、ユーザ特定・音声分析処理部４２、バッファメモリ部４３、音声データベース４４、比較・検出部４５、音声再生部４６、中央制御部４７、音声入力部４８、ユーザーインターフェース部４９、音声出力部５０、基地局制御部５１とを具備する。
【００３３】
続いて、音声補完装置４０の動作について図２を参照しながら説明する。
【００３４】
移動端末Ａ１０からの発呼要求に基づいて固定電話６０との間の通話パスが確立され、移動端末Ａ１０ユーザと固定電話６０ユーザとで会話が開始されると、その会話の音声が音声入力部４８に入力される。この音声入力部４８に入力された音声（＝音声信号）は、音声認識部４１で音声認識（例：入力された音声から音素テキスト(発声された文字列)を認識する）され、その音声認識で得られた結果がユーザ特定・音声分析処理部４２に入力される。
【００３５】
ユーザ特定・音声分析処理部４２は、音声認識部４１で得られた認識結果に基づいて移動端末Ａ１０ユーザ（話者）を特定するための分析処理を行うと共に、移動端末Ａ１０ユーザが固定電話６０ユーザとの会話で使っている頻度の高いフレーズを抽出する役割を担う。このユーザ特定・音声分析処理部４２で抽出された出現頻度の高いフレーズは一旦バッファメモリ４３に蓄積され、その蓄積されたフレーズと同一のフレーズが一定回数以上繰り返された場合に、そのフレーズを移動端末Ａ１０ユーザでの利用頻度の高いフレーズとみなして音声データベース４４に登録する。音声データベース４４には、このようにして登録されるフレーズがユーザ毎に分類されている。
【００３６】
図３は、ユーザにて使用頻度の高いフレーズをユーザ毎に登録する音声データベース４４の内部構成例である。
【００３７】
この音声データベース４４は、ユーザを分類するフィールド（▲１▼）と、ユーザの選択・非選択を表すフィールド（▲２▼）と、登録されたフレーズの音声を管理するフィールド（▲３▼）とで構成される。
【００３８】
ユーザを分類するフィールド（▲１▼）には、ユーザを識別することが可能な加入者番号や識別ＩＤ（例：Ｕ１、Ｕ２・・・）などが用いられる。また、ユーザの選択・非選択を表すフィールド（▲２▼）は、移動端末Ａ１０ユーザにて「登録」との選択がなされたフレーズに対して、選択を表すフラグを立て（「１」）、非選択の決定がなされたフレーズに対しては非選択を表すフラグ（「０」）を立てる。登録されたフレーズを管理するフィールド（▲３▼）は、登録したフレーズ毎に番号（登録音声認識番号）を割当て管理する。例えば、Ｖ００１、Ｖ００２・・・などの番号がフレーズごとに割付けられる。
【００３９】
図２に戻って、音声補完装置４０の動作の説明を続ける。
【００４０】
本発明の音声補完装置４０では、移動端末Ａ１０ユーザと固定電話６０ユーザが通話している間（通話中）、音声認識部４１の音声認識機能により通話中の音声の音声情報が常にモニタされる。比較・検出部４５は、その通話中の音声の音声情報が音声データベース４４に登録されたフレーズの最初の音節の音声情報と一致したかどうかを比較し、一致したフレーズを含む音声情報を音声データベース４４から検出する。この比較・検出部４５で検出されたフレーズの音声情報は音声再生部４６で再生された後、音声再生信号となって音声出力部５０から固定電話６０へと送られる。固定電話６０では、音声出力部５０から出力された音声再生信号に基づいて音声の再生を行う。
【００４１】
音声データベース４４に登録されたフレーズの音声情報は、音声再生部４６で音声再生信号となって音声出力部５０から定期的に移動端末Ａ１０に通知され、移動端末Ａ１０では、音声データベース４４に登録されたフレーズを聞くことができる。
【００４２】
中央制御部４７は、基地局制御部５１を介して、移動端末Ａ１０が接続している無線基地局２０に対し移動端末Ａ１０に登録されたフレーズの音声情報を再生させるための指令を送るよう命令する。尚、移動端末Ａ１０でのフレーズの音声情報の登録手順については後述する。また、ユーザーインターフェース部４９はネットワーク装置３０が音声データベース４４にアクセスしたり、そのアクセスに基づいて音声データベース４４から情報を出力したりする際のインターフェース機能を有する。
【００４３】
次に、本発明の移動端末Ａ１０の装置構成を図４を参照しながら説明する。
【００４４】
図４において、この移動端末Ａ１０は、音声補完制御部１１、送受信制御部１２、登録音声メモリ１３、入力・再生部１４、送受信部１５、マイク／スピーカ部１６、操作部１７、アンテナ部１８とを具備して構成される。
【００４５】
続いて、移動端末Ａ１０の動作について図４を参照しながら説明する。
【００４６】
アンテナ部１８は、音声補完装置４０内の音声出力部５０から定期的に通知される無線信号（音声データベース４４に登録されたフレーズの音声情報が含まれる）を無線基地局２０経由で受信した後、送受信部１５に送る。この無線信号は、送受信部１５で周波数変換や復調処理が施された後、フレーズの音声情報が抽出されて入力・再生部１４に送られる。入力・再生部１４から出力されるフレーズの音声情報は操作部１７からの指示によって、登録音声メモリ１３に登録すべきかあるいは非登録とすべきかの選択が行えるようになっている。
【００４７】
図５は、登録音声メモリ１３の内部構成例を示した図である。
【００４８】
この登録音声メモリ１３は、ユーザを認識するフィールド（▲１▼）と、ユーザの選択を表すフィールド（▲２▼）と、登録されたフレーズの音声情報を管理するフィールド（▲３▼）とで構成される。
【００４９】
ユーザを分類するフィールド（▲１▼）には、ユーザを識別するための加入者番号や識別ＩＤ（例：Ｕ１）などが用いられる。また、ユーザの選択を表すフィールド（▲２▼）は、移動端末ユーザにて登録するとの選択がなされたフレーズに対して、選択を表すフラグ（「１」）を立てる。登録されたフレーズの音声情報を管理するフィールド（▲３▼）は、登録したフレーズ毎に番号（登録音声認識番号）を割当て管理する。例えば、Ｖ００１、Ｖ００２・・・などの番号がフレーズごとに割付けられる。
【００５０】
図４に戻って、移動端末Ａ１０の動作の説明を続ける。
【００５１】
送受信制御部１２は、送受信部１５に具備される送信機あるいは受信機のＯＮ／ＯＦＦ制御等を行う。例えば、送受信部１５で無線信号を受信する際には、送受信制御部１２は送受信部１５に対し受信機ＯＮ、送信機ＯＦＦとなるよう制御し、反対に、送受信部１５で無線信号を送信する際には、送受信部１５に対し送信機ＯＮ、受信機ＯＦＦとなるよう制御する。
【００５２】
送受信部１５の復調で得られたフレーズの音声情報は、入力・再生部１４に入力された後、音声再生されてマイク／スピーカ部１６に音声出力される。このマイク／スピーカ部１６より出力された音声は、その後、操作部１７で「選択」あるいは「非選択」の操作が移動端末Ａ１０ユーザにてなされて、登録音声メモリに登録される。つまり、ここでは、移動端末Ａ１０ユーザがどのフレーズを有効とするかどうかの選択を行う。尚、移動端末Ａ１０ユーザの操作に基づいて該当のフレーズの音声を登録音声メモリ１３に登録する手順については後述する。
【００５３】
上記操作部１７の操作で、「選択」との決定がなされたフレーズの音声については、「選択」されたことを示す通知が送受信部１５から出力され、アンテナ部１８を介して音声補完装置４０に通知される。この「選択」通知を受信した音声補完装置４０内の音声データベース４４は該当するフレーズの音声情報をＯＮ（有効）にする。
【００５４】
また、上記音声補完装置４０の動作の説明で述べたように、無線基地局２０からは、移動端末Ａ１０の登録音声メモリ１３に登録されたフレーズの音声を再生させるための指令が当該移動端末Ａ１０に対して送出される。この指令には、移動端末Ａ１０の送信をＯＦＦにさせるための指令と、該移動端末Ａ１０内の登録音声メモリ１３に登録されたフレーズのうち流すべきフレーズを指定する内容の指令が含まれている。
【００５５】
上記指令を送受信部１５経由で受けた送受信制御部１２は、送受信部１５に対し、送信をＯＦＦにする命令を出す。また、上記指令は、送受信制御部１２から音声補完制御部１１に送られ、該音声補完制御部１１で再生すべきフレーズの音声の登録音声認識番号などを読取ってその結果を入力・再生部１４に伝える。入力・再生部は登録音声メモリ１３にアクセスし、該当する登録音声認識番号のフレーズの音声情報を取得し音声再生する。このようにして入力・再生部１４で再生されるフレーズの音声は、マイク／スピーカ部１６で聞くことができるようになっている。
【００５６】
次に、本発明の音声補完方法による音声補完の処理手順を図６を参照して詳述する。
【００５７】
図６は、本発明の音声補完装置４０による音声補完の処理手順の一例を示すフローチャートである。
【００５８】
図６において、移動端末Ａ１０ユーザ（以下、ユーザＡという）が固定電話６０ユーザ（以下、ユーザＢという）との通話をしているときに発したフレーズは、音声補完装置４０内の音声認識部４１で認識された後、そのフレーズ（以下、フレーズＡという）を発したユーザ（この場合、ユーザＡ）を特定するための分析がユーザ特定・音声分析処理部４２で行われる。このユーザ特定・音声分析処理部４２で得られた分析結果は一旦バッファメモリ４３に蓄積（Ｓ１）される。つまり、ここでは、ユーザＡが発したフレーズＡが一時、バッファメモリ４３に蓄積される。
【００５９】
このようにしてバッファメモリ４３に一時蓄積されたユーザＡのフレーズＡは比較・検出部４５に送られ、そのフレーズＡが前回にも使われているか否かが判定（Ｓ２）される。比較・検出部４５は、この判定（Ｓ２）で、バッファメモリ４３から出力されたユーザＡのフレーズＡが前回にも同じフレーズが使われていない判定（Ｓ２でＮＯ）された場合、バッファメモリ４３に対し、フレーズＡの蓄積を保持しておく指示を出すが、該判定（Ｓ２）で、バッファメモリ４３から出力されたユーザＡのフレーズＡが前回にも同じフレーズが使われていると判定（Ｓ２でＹＥＳ）された場合、更に、そのフレーズＡの使われている回数が予め定められた回数（ｎ）以上に達したかどうかの判定（Ｓ３）を行う。例えば、ｎ＝３とした場合、（Ｓ３）の判定でフレーズＡの出現回数Ｃがｎ以下の場合（Ｓ３でＮＯ）、該出現回数Ｃが「１」インクリメント（Ｓ４）される。従って、上記判定（Ｓ３）で、フレーズＡの出現回数Ｃがｎ以上（Ｓ３でＹＥＳ）となったときに次のステップ（Ｓ５）に進む。
【００６０】
上記判定（Ｓ３）で、フレーズＡの出現回数Ｃがｎ以上となったと判定（Ｓ３でＹＥＳ）された場合、比較・検出部４５は、音声データベース４４にアクセスして、そのフレーズＡが既に登録されているかどうかを問合せる（Ｓ５）。この問合せ（Ｓ５）で、フレーズＡが既に登録されているとの応答を音声データベース４４より得た場合（Ｓ５でＹＥＳ）、比較・検出部４５は、音声データベース４４に対し、更に、その登録済みのフレーズＡが移動端末Ａ１０ユーザより有効とする旨の選択がなされているかどうかを問合せ（Ｓ７）を行う。
【００６１】
比較・検出部４５は、その問合せ（Ｓ７）で、フレーズＡの「選択」がなされているとの応答を得た（Ｓ７でＹＥＳ）場合、中央制御部４７を介し基地局制御部５１に移動端末Ａ１０の送信停止要求を表す信号を無線基地局２０に出力するよう指示（Ｓ９）すると共に、音声データベース４４からフレーズＡの音声情報を検出して音声再生部４６に送る。音声再生部４６は、フレーズＡの音声情報を移動端末Ｂ７０で音声再生可能な音声再生信号に変換して音声出力部５０に送る。
この音声出力部５０から出力された音声再生信号は、移動端末Ａ１０の通話相手の（ユーザＢ）固定電話６０に送られ、該固定電話６０では受信した音声再生信号からフレーズＡの音声を聞くことができる（Ｓ１０）ようになっている。
【００６２】
しかしながら、上記判定（Ｓ５）で、フレーズＡが音声データベース４４に登録されていないと判定（Ｓ５でＮＯ）された場合、該フレーズＡは音声データベース４４に登録される。また、上記判定（Ｓ７）で、フレーズＡが移動端末Ａ１０ユーザより選択されていない場合（Ｓ７でＮＯ）は、補完を行わないで処理を中止する。
【００６３】
このように本発明の音声補完方法では、ユーザＡが使う頻度の高いフレーズが音声データベース４４に自動的に登録される。ユーザＡとユーザＢ間の通話は、常にモニタされ、ユーザＡが音声データベース４４に登録されたフレーズＡの最初の音節を発したとき、音声データベース４４に登録されている複数のフレーズからそのフレーズＡの最初の音節と一致するフレーズが抽出される。例えば、音声データベース４４に登録されているフレーズＡを「いつもお世話になっております」とした場合、ユーザＡがフレーズＡの最初の音節「いつも」を発すると、比較・検出部４５は音声データベース４４にアクセスし、「いつも」を最初の音節とするフレーズを音声データベース４４に登録されているフレーズからサーチする。このサーチで、「いつも」を最初の音節とするフレーズが「いつもお世話になっております」しか検出されなければ、このフレーズをフレーズＡと一致したとみなして抽出する。
【００６４】
このようにして抽出されたフレーズＡは、ユーザＢに提供され該ユーザＢでは、ユーザＡの声で「いつもお世話になっております」のフレーズを聞くことができる。しかしながら、「いつも」を最初の音節とするフレーズが複数検出された場合は、次の文字（この場合、「お」）が付加（「いつもお」）されて再度サーチが行われる。このような場合、ユーザＡに対しては、「いつも」を最初の音節とするフレーズが複数検出された旨が通知されるが、ユーザＡはこの通知に基づいて、次の文字を発すれば音声補完装置４０で自動的に該当するフレーズＡを検出してくれるようになっている。
【００６５】
従って、ユーザＡはフレーズの最初の音節を発するだけで相手方に意思伝達できるので、会話の中で定型文句を多く話すユーザにとっては該定型文句を全て話さなくてもよくなり利便性が向上する。また、高齢者や話すことが不自由なユーザであれば、予め伝えたいフレーズを予め登録しておけば、会話をスムーズに進めることも可能となる。
【００６６】
また、本発明の音声補完装置４０は、音声データベース４４で抽出されたフレーズＡの音声をユーザＢに流すと共に、ユーザＡの移動端末Ａ１０に対し、送信をＯＦＦにし、自移動端末Ａ１０に登録されているフレーズＡを流すよう指示する。この指示を無線基地局２０を介して受信した移動端末Ａ１０は、登録されているフレーズＡを流し、その間の送信をＯＦＦにする。即ち、音声補完により補完したフレーズを流している間は移動端末Ａ１０の送信が停止となるので該移動端末Ａ１０の消費電力をより低減することが可能となる。
【００６７】
また、上述の説明では、移動端末Ａ１０の相手方の端末を固定電話６０と仮定したが、本発明はこれに限定されるものでなく相手方の端末が移動端末Ｂ７０であっても勿論よい。
【００６８】
上記実施例では、音声補完装置４０内の音声データベース４４にフレーズを登録する一例として、移動端末Ａ１０ユーザが固定電話６０ユーザと通話中、通話内容が音声認識部４１で音声認識された後、バッファメモリ４３に一時蓄積し、同一フレーズが一定回数以上繰り返されたら、その同一フレーズがユーザの声で音声データベース４４蓄積される場合を例示した。
【００６９】
次に、移動端末Ａ１０にフレーズを登録する方法の一例を以下に示す。
【００７０】
例えば、図７は、移動端末Ａ１０にフレーズを登録する手順の一例を示したフローチャートである。
【００７１】
（登録方法１）
図７において、移動端末Ａ１０は音声補完装置４０から音声データベース４４に登録されたフレーズに係る情報（登録通知）を定期的に受信（Ｓ２１）する。移動端末Ａ１０は、この登録通知を受信すると、所定タイミングで音声補完装置４０の音声データベース４４にアクセスし、該音声データベース４４に登録されたフレーズの音声情報を受信する。移動端末Ａ１０は、そのフレーズの音声情報を受信後、音声再生を行い（Ｓ２３）、その再生されたフレーズのうちどのフレーズを有効とするか否かの決定（選択あるいは非選択）が操作部１７の操作によってなされる。操作部１７の操作で「選択」を表す操作が移動端末Ａ１０ユーザによってなされた場合（Ｓ２４でＹＥＳ）、その「選択」で選ばれたフレーズを自移動端末Ａ１０内の登録音声メモリ１３に記憶すると共に、音声補完装置４０内の音声データベース４４に対して、該フレーズを有効する旨の通知（選択通知）を送出（Ｓ２５）する。一方、操作部１７の操作で「非選択」を表す操作が移動端末Ａ１０ユーザにてなされた場合（Ｓ２４でＮＯ）、「非選択」したフレーズの登録は行われず、かつ音声補完装置４０内の音声データベース４４に対し、「非選択」したフレーズを有効としない旨の通知（非選択通知）を送出する。
続いて、移動端末Ａ１０にフレーズを登録する他の例を図８を参照しながら説明する。
【００７２】
（登録方法２）
図８において、移動端末Ａ１０は音声補完装置４０から音声データベース４４に登録されたフレーズに係る情報（登録通知）を定期的に受信（Ｓ３１）する。移動端末Ａ１０は、この登録通知を受信すると、ｉモード（文字情報サービスの一つ）などを用いて音声補完装置４０内の音声データベース４４にアクセス（Ｓ３２）し、該音声データベース４４に登録されているフレーズの音声情報の一覧をダウンロード（Ｓ３３）する。このダウンロードしたフレーズの音声情報にはそれぞれを識別する番号が割り振られ、移動端末Ａ１０ユーザは、移動端末Ａ１０の画面上で番号をクリック（番号選択）することで該当する番号のフレーズの音声を聞けるようになっている。
【００７３】
このようにして、フレーズの番号の選択がなされ（Ｓ３４）、その選択されたフレーズの音声が再生（Ｓ３５）されると、その再生されたフレーズのうちどのフレーズを有効とするか否かの決定（選択あるいは非選択）が操作部１７の操作によってなされる。操作部１７の操作で「選択」を表す操作が移動端末Ａ１０ユーザによってなされた場合（Ｓ３６でＹＥＳ）、その「選択」で選ばれたフレーズを自移動端末Ａ１０内の登録音声メモリ１３に記憶すると共に、音声補完装置４０内の音声データベース４４に対して、該フレーズを有効する旨の通知（選択通知）を送出（Ｓ３７）する。一方、操作部１７の操作で「非選択」を表す操作が移動端末Ａ１０ユーザにてなされた場合（Ｓ３７でＮＯ）、その「非選択」したフレーズの登録は行われず、かつ音声補完装置４０内の音声データベース４４に対し、「非選択」したフレーズを有効としない旨の通知（非選択通知）を送出する。
【００７４】
尚、上記（登録方法１）及び（登録方法２）において、音声補完装置４０から移動端末Ａ１０に対し定期的に通知されるフレーズに係る情報（登録通知）は、音声情報であっても電子メールのようなテキスト形式になっているものでもかまわない。この場合、音声データベース４４に登録されたフレーズの音声情報が音声再生部４６で音声再生信号となった後、中央制御部４７でその音声信号が読取られ、テキスト形式に変換された後、基地局制御部５１を介して移動端末Ａ１０に送られる。
【００７５】
上記例において、音声補完装置４０内の音声データベース４４の登録機能が音声補完情報格納手段に、比較・検出部４５の音声比較検出機能が音声情報抽出手段に、中央制御部４７の制御機能及び基地局制御部５１の制御信号送出機能が再生指示手段及び音声送信停止指示手段に、ユーザ特定・音声分析処理部４２のユーザ特定・音声分析機能および音声データベース４４の登録機能が音声補完情報自動登録手段に対応する。また、音声出力部４６の音声出力機能が音声情報送信手段及び第1の音声情報送信手段及び第２の音声情報送信手段に対応する。更に、同音声補完装置４０内の中央制御部４７の制御機能及び基地局制御部５１の通知信号送出機能が音声情報通知手段に対応する。
【００７６】
また、更に、移動端末Ａ１０内の登録音声メモリ１３が第1〜３登録手段に対応し、操作部１７の決定機能が音声情報選択手段に、入力・再生部１４の音声再生機能が音声再生手段に対応する。送受信部１５の送信機能が報告手段に、送受信制御部１２の送受信部１５制御機能が送信停止手段に対応する。
【００７７】
【発明の効果】
以上、説明したように、請求項１乃至１０記載の本願発明によれば、ユーザが特によく使うフレーズが音声補完装置に登録され、そのフレーズを登録したユーザが通話中、音声補完装置が登録されたフレーズの最初の音節が一致したことを認識した場合、該音声補完装置はユーザの電話端末が接続している無線基地局に対し、送信をＯＦＦにし、自電話端末に登録された該当フレーズを流す指示を送るよう命令する。そして、無線基地局からその指示を受けた電話端末は、登録された該当フレーズを流し、その間の送信電力をＯＦＦにする。同時に相手方の電話端末へは、該音声補完装置で補完した音声が流れるようになっている。
【００７８】
その結果、ユーザが使用頻度の高いフレーズを発する度に上記のような音声補完がなされて送信電力がＯＦＦとなることから、より電話端末の消費電力の軽減が図れる。また、ユーザでは、電話端末からデータ等の面倒な入力を行う必要がなく、かつ上記音声の補完に用いられる音声はユーザ自身の音声が用いられるため自然な音声補完サービスの提供が実現可能になる。
【００７９】
また、請求項１１乃至１７記載の本願発明によれば、上記のような音声補完方法に従って音声の補完が可能となる音声補完装置を実現することができる。
【００８０】
更に、請求項１８乃至２４記載の本願発明によれば、上記のような音声補完方法に従って音声を再生している間、送信電力を低減することのできる電話端末装置を提供することができる。
【図面の簡単な説明】
【図１】本発明の実施の一形態に係る音声補完方法が適用される移動通信システムの構成例を示す図である。
【図２】図１に示す移動通信システムにおいて音声補完装置のブロック図を示す図である。
【図３】使用頻度の高いフレーズをユーザ毎に登録する音声データベースの内部構成例を示す図である。
【図４】図１に示す移動通信システムにおいて移動端末Ａのブロック図を示す図である。
【図５】ユーザにて選択したフレーズが登録される移動端末の登録音声メモリの内部構成例を示す図である。
【図６】本発明の音声補完装置による音声補完の処理手順の一例を示すフローチャートである。
【図７】移動端末にフレーズを登録する手順の一例（その１）を示す図である。
【図８】移動端末にフレーズを登録する手順の一例（その２）を示す図である。
【符号の説明】
１０移動端末Ａ
１１音声補完制御部
１２送受信制御部
１３登録音声メモリ
１４入力・再生部
１５送受信部
１６マイク／スピーカ部
１７操作部
１８アンテナ部
２０無線基地局
３０ネットワーク装置
４０音声補完装置
４１音声認識部
４２ユーザ特定・音声分析処理部
４３バッファメモリ部
４４音声データベース
４５比較・検出部
４６音声再生部
４７中央制御部
４８音声入力部
４９ユーザーインターフェース部
５０音声出力部
５１基地局制御部
６０固定電話
７０移動端末Ｂ
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a voice complementing method and a voice complementing apparatus and, more particularly, to a voice complementing method, a voice complementing apparatus, and a telephone terminal that can reduce the power consumption of the telephone terminal by complementing the voice of the telephone terminal user during a voice call.
[0002]
[Prior art]
Current cellular mobile communication systems include car phones, mobile phones (e.g., PDC (Personal Digital Cellular) mobile communication systems), and simple mobile phones (e.g., PHS (Personal Handy phone System)). When using these mobile communication systems, the user enters the other party's telephone number on a mobile terminal (e.g., a mobile phone), and after the connection to the other party is established, voice or data is transmitted to the other party.
[0003]
When a user uses a mobile terminal, input operations by the user are required in cases such as: (1) entering a telephone number or registering a phone book entry; (2) composing character data such as e-mail; (3) reading received character data; and (4) talking with the other party. The following techniques have conventionally been proposed as methods for assisting these inputs (1) to (4).
[0004]
1. As a technique for assisting terminal operation (corresponding to (1) above), Japanese Patent Laid-Open No. 8-23369 proposes operating a telephone device by voice input. Japanese Patent Laid-Open No. 2000-78267 discloses a technique that establishes a connection by generating a code from a "word code" instead of pressing buttons to enter the line number. Further, Japanese Patent Application No. 10-369778 discloses a technique that extracts a telephone number from speech uttered by the other party and registers it in the phone book.
[0005]
2. Japanese Patent Laid-Open No. 9-32184 discloses a technique for assisting the creation of character data (corresponding to (2) above) and the reading of received data (corresponding to (3) above), in which the text data of e-mail stored in a portable terminal is conveyed to the user of the portable information terminal by synthesized speech.
[0006]
3. As a technique for assisting a call (corresponding to (4) above), Japanese Patent Laid-Open No. 9-32184 discloses a technique in which the content to be transmitted is entered as data on the telephone terminal, converted into a synthesized speech signal, and sent out over the telephone line. Japanese Patent Laid-Open No. 6-131583 discloses a technique that automatically connects to a telephone line when the user utters a specific sound (word) and makes a report by outputting a voice message stored in advance.
[0007]
[Problems to be solved by the invention]
As described above, the conventional methods only convert data that the user has entered as text into speech. To convey a message, the user must therefore enter characters by key operation, which takes time and forces a troublesome operation on the user. In addition, when synthesized speech (speech synthesis that makes a computer or the like speak) is used, problems remain in the control of prosody (accent and intonation) and in intelligibility, and at present it is difficult to generate speech that is very close to a human voice. With the conventional methods, the output therefore differs from the user's own voice and sounds unnatural.
[0008]
Therefore, a first object of the present invention is to provide a voice complementing method, a voice complementing apparatus, and a telephone terminal device that enable voice complementing in the user's own voice without imposing a complicated input operation on the user.
[0009]
[Means for Solving the Problems]
To solve the first problem, the present invention, as set forth in claim 1, is a voice complementing method that complements voice information when a telephone terminal performs voice communication with another telephone terminal via a predetermined communication network, configured such that: voice information from the telephone terminal is registered in advance in voice complement information storage means connected to the predetermined communication network; when the user's voice is input while the telephone terminal and the other telephone terminal are in voice communication, voice information that includes the voice information transmitted from the telephone terminal is extracted from the voice information registered in the voice complement information storage means; and the extracted voice information is transmitted to the other telephone terminal.
[0010]
In such a voice complementing method, while the user of a telephone terminal (hereinafter, user A) is talking with the user of another telephone terminal (hereinafter, user B), voice information that includes the voice information of the ongoing call is extracted from the voice information registered in the voice complement information storage means connected to the predetermined network, and the extracted voice information is transmitted to user B's telephone terminal.
[0011]
For example, when the voice information "oha" is transmitted from user A's terminal, the voice complement information storage means extracts, from the voice information registered in advance, the voice information "ohayou gozaimasu" ("good morning"), which includes "oha". Because the extracted "ohayou gozaimasu" is transmitted to user B's telephone terminal in the user's own voice, user B can listen to it without any sense of unnaturalness. In other words, according to the present invention, the voice information "you gozaimasu", which is the difference from the "oha" uttered by user A, is complemented. As a result, conversation between the users can be supported so that it proceeds smoothly.
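The completion described in this paragraph can be sketched as a prefix lookup. This is only an illustration under stated assumptions: the patent operates on recorded voice information, whereas the sketch stands in romanized text strings for the recognized speech, and the function name `complement` and the list `registered_phrases` are hypothetical.

```python
from typing import List, Optional, Tuple


def complement(heard: str, registered: List[str]) -> Optional[Tuple[str, str]]:
    """Return (full_phrase, missing_part) for the first registered phrase
    that begins with what was heard, or None if nothing matches."""
    for phrase in registered:
        if phrase.startswith(heard) and phrase != heard:
            # The part after the heard fragment is what gets complemented.
            return phrase, phrase[len(heard):]
    return None


# Hypothetical registered phrases (text stands in for stored voice data).
registered_phrases = ["ohayou gozaimasu", "itsumo osewa ni natte orimasu"]

match = complement("oha", registered_phrases)
no_match = complement("kon", registered_phrases)
```

Here `match` holds the full phrase together with the remainder "you gozaimasu", which corresponds to the difference the paragraph says is complemented.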
[0012]
The voice complement information storage means in the present invention may be provided in a network device of a telecommunications carrier connected to the predetermined network, or may be provided as a new node; its installation location is not limited.
[0013]
So that the voice information registered in the voice complement information storage means can also be registered in the telephone terminal, the present invention, as set forth in claim 2, is configured such that, in the above voice complementing method, the voice complement information storage means transmits the registered voice information to the telephone terminal, and the telephone terminal receives and registers that voice information.
[0014]
So that frequently used voice information among the voice information uttered by the user can be extracted and registered automatically, the present invention, as set forth in claim 3, is configured such that, in the above voice complementing method, the voice complement information storage means extracts and automatically registers voice information that appears with high frequency among the voice information transmitted from the user while the telephone terminal and the other telephone terminal are in voice communication.
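A minimal sketch of frequency-based automatic registration follows. It assumes that recognized phrases arrive as text and that a phrase is registered once it has been observed a fixed number of times; the class name `AutoRegistrar`, its method names, and the default threshold of 3 are illustrative choices, not taken from the claim.

```python
from collections import Counter


class AutoRegistrar:
    """Counts phrases per user and registers those that recur often enough."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold   # occurrences needed before registration
        self.counts = Counter()      # (user_id, phrase) -> times observed
        self.registered = {}         # user_id -> set of registered phrases

    def observe(self, user_id: str, phrase: str) -> bool:
        """Record one occurrence; return True if this call newly registers it."""
        key = (user_id, phrase)
        self.counts[key] += 1
        if self.counts[key] < self.threshold:
            return False
        phrases = self.registered.setdefault(user_id, set())
        newly_added = phrase not in phrases
        phrases.add(phrase)
        return newly_added


registrar = AutoRegistrar(threshold=3)
results = [registrar.observe("U1", "ohayou gozaimasu") for _ in range(4)]
```

With a threshold of 3, the third observation is the one that registers the phrase, and later observations leave the registry unchanged.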
[0015]
In addition, so that voice information frequently used by the user can also be registered in the telephone terminal, the present invention, as set forth in claim 4, is configured such that, in the above voice complementing method, the telephone terminal receives from the voice complement information storage means the voice information with a high appearance frequency extracted by that storage means, and registers it.
[0016]
So that the user can freely select and register the voice to be registered in the voice complement information storage means, the present invention, as set forth in claim 5, is configured such that, in the above voice complementing method: the voice complement information storage means accumulates either voice information or voice information with a high appearance frequency, and notifies the telephone terminal of the accumulated voice information or of information for conveying that voice information; after the user has selected, on the basis of that notification, the voice information to be registered in the voice complement information storage means, the telephone terminal reports the selection result to the voice complement information storage means; and the voice complement information storage means registers the voice information on the basis of the report from the wireless terminal.
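The accumulate, notify, select, and report sequence in this paragraph could be modeled roughly as below. The record layout (a per-phrase selection flag and identifiers such as "V001") and all class and method names are assumptions made for illustration; the claim itself does not prescribe a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class PhraseRecord:
    phrase_id: str           # hypothetical identifier, e.g. "V001"
    text: str                # stands in for the stored voice information
    selected: bool = False   # True once the user has chosen to register it


@dataclass
class ComplementStore:
    records: dict = field(default_factory=dict)  # user_id -> list of PhraseRecord

    def accumulate(self, user_id: str, phrase_id: str, text: str) -> None:
        """Store a candidate phrase before the user has decided on it."""
        self.records.setdefault(user_id, []).append(PhraseRecord(phrase_id, text))

    def notify(self, user_id: str) -> list:
        """Candidates that would be sent to the terminal for review."""
        return [(r.phrase_id, r.text) for r in self.records.get(user_id, [])]

    def apply_report(self, user_id: str, selections: dict) -> None:
        """Apply the terminal's report: phrase_id -> selected (True/False)."""
        for record in self.records.get(user_id, []):
            if record.phrase_id in selections:
                record.selected = selections[record.phrase_id]

    def registered(self, user_id: str) -> list:
        """Phrases that are active for complementing."""
        return [r.text for r in self.records.get(user_id, []) if r.selected]


store = ComplementStore()
store.accumulate("U1", "V001", "ohayou gozaimasu")
store.accumulate("U1", "V002", "otsukaresama desu")
candidates = store.notify("U1")                          # sent to the terminal
store.apply_report("U1", {"V001": True, "V002": False})  # user's choices
```

Keeping unselected candidates in the store, rather than deleting them, is a design choice of this sketch so that a later report can still enable them.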
[0017]
Further, the present invention, as set forth in claim 6, is configured such that, in the above voice complementing method, after the user has selected the voice information to be registered in the voice complement information storage means, the telephone terminal registers the voice information obtained on the basis of that selection result.
[0018]
So that the same voice information as that complemented by the voice complement information storage means can be reproduced, the present invention, as set forth in claim 7, is configured such that, in the above voice complementing method: when the user's voice is input while the telephone terminal and the other telephone terminal are in voice communication, the voice complement information storage means extracts, from the voice information registered in it, voice information that includes the voice information transmitted from the telephone terminal, and transmits to the telephone terminal a signal serving as an instruction to reproduce the same voice information as the extracted voice information; and the telephone terminal reproduces the voice information registered in advance in accordance with that instruction.
[0019]
In such a voice complementing method, the voice complement information storage means does not transmit to user A the voice information extracted on the basis of user A's utterance; instead, it sends an instruction to read the extracted voice information out of the voice information registered in advance in user A's telephone terminal and reproduce it. That is, because the voice complement information storage means sends user A's telephone terminal only the signal serving as this instruction, radio resources that would otherwise be used for transmitting voice information can be saved.
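To make the radio-resource argument concrete, the sketch below compares a small playback instruction with the audio it replaces. Everything about the message is hypothetical: the patent does not define a message format, so the opcode, the 3-byte layout, the phrase numbering, and the 32 kbit/s audio rate are assumptions chosen only for illustration.

```python
import struct

OP_PLAY = 0x01  # hypothetical opcode meaning "play a locally registered phrase"


def encode_play_instruction(phrase_number: int) -> bytes:
    """Pack the instruction as 1 byte of opcode plus a 2-byte phrase number."""
    return struct.pack("!BH", OP_PLAY, phrase_number)


def decode_play_instruction(message: bytes) -> int:
    """Return the phrase number carried by a playback instruction."""
    opcode, phrase_number = struct.unpack("!BH", message)
    if opcode != OP_PLAY:
        raise ValueError("not a playback instruction")
    return phrase_number


def audio_payload_size(duration_s: float, bitrate_bps: int = 32_000) -> int:
    """Bytes needed to send the audio itself at an assumed codec bit rate."""
    return int(duration_s * bitrate_bps / 8)


# The terminal's local copy of registered phrases (placeholder bytes).
local_memory = {1: b"<recorded audio for phrase 1>"}

instruction = encode_play_instruction(1)
audio_to_play = local_memory[decode_play_instruction(instruction)]
instruction_bytes = len(instruction)
audio_bytes = audio_payload_size(2.0)
```

Under these assumptions, a two-second phrase would need thousands of bytes as audio, while the instruction that triggers local playback occupies only a few bytes.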
[0020]
So that the power consumption of the telephone terminal can be further reduced, the present invention, as set forth in claim 8, is configured such that, in the above voice complementing method: when the voice complement information storage means transmits to the telephone terminal the instruction to reproduce the same voice information as the voice information it has extracted, it also transmits to the telephone terminal an instruction to stop the transmission of voice information while the telephone terminal is reproducing voice information on the basis of the instruction from the voice complement information storage means; and the telephone terminal, in accordance with that instruction, stops voice transmission regardless of voice input from the user while it is reproducing the voice information registered in advance.
[0021]
In such a voice complementing method, while the telephone terminal is reproducing the registered voice, the output of voice transmission is stopped regardless of voice input from the user, so the power consumption of the telephone terminal can be reduced.
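The terminal-side behavior described here can be sketched as a small state holder: transmission is switched off for the duration of playback and microphone input is dropped in the meantime. The class `Terminal` and its method names are hypothetical, and real playback timing is reduced to explicit start and finish calls.

```python
class Terminal:
    """Toy model of a terminal that mutes transmission during playback."""

    def __init__(self, registered_audio: dict):
        self.registered_audio = dict(registered_audio)  # phrase_id -> audio
        self.transmitter_on = True
        self.sent = []     # frames actually transmitted
        self.played = []   # audio played from local memory

    def speak(self, frame: str) -> bool:
        """User voice input; transmitted only while the transmitter is on."""
        if not self.transmitter_on:
            return False   # input is ignored during playback
        self.sent.append(frame)
        return True

    def start_playback(self, phrase_id: str) -> None:
        """Handle the play instruction together with the stop-transmission one."""
        self.transmitter_on = False
        self.played.append(self.registered_audio[phrase_id])

    def finish_playback(self) -> None:
        """Playback is over, so normal transmission resumes."""
        self.transmitter_on = True


terminal = Terminal({"V001": "ohayou gozaimasu"})
before = terminal.speak("oha")
terminal.start_playback("V001")
during = terminal.speak("you gozaimasu")
terminal.finish_playback()
after = terminal.speak("kyou wa")
```

Only the frames spoken before and after playback end up in `terminal.sent`; the one spoken during playback is discarded, which is the interval in which the transmitter draws no power.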
[0022]
Further, the present invention, as set forth in claim 9, is a voice complementing method that complements voice information when a telephone terminal performs voice communication with another telephone terminal via a predetermined communication network, configured such that: voice information from the telephone terminal is registered in advance in voice complement information storage means connected to the predetermined communication network; when the user's voice is input while the telephone terminal and the other telephone terminal are in voice communication, voice information that includes the voice information transmitted from the telephone terminal is extracted from the voice information registered in the voice complement information storage means; and the extracted voice information is transmitted to both the telephone terminal and the other telephone terminal.
[0023]
Furthermore, so that a mobile terminal (e.g., a mobile phone) can be used as the telephone terminal, the present invention, as set forth in claim 10, is configured such that, in the above voice complementing method, a mobile terminal device connected to a predetermined communication network is used as the telephone terminal.
[0024]
To solve the above problem, the present invention, as set forth in claim 11, is a voice complementing apparatus that complements voice information when a telephone terminal performs voice communication with another telephone terminal via a predetermined communication network, comprising: voice complement information storage means that is connected to the predetermined communication network and in which voice information from the telephone terminal is registered in advance; voice information extracting means that, when the user's voice is input while the telephone terminal and the other telephone terminal are in voice communication, extracts voice information that includes the voice information transmitted from the telephone terminal from the voice information registered in the voice complement information storage means; and voice information transmitting means that transmits the extracted voice information to the other telephone terminal.
[0025]
Furthermore, to solve the above problem, the present invention, as set forth in claim 18, is a telephone terminal device that communicates with another telephone terminal via a predetermined communication network,
wherein a voice complementing apparatus that complements voice information when the telephone terminal performs voice communication is connected to the predetermined communication network; in the voice complementing apparatus, voice information from the telephone terminal is registered in advance in voice complement information storage means, voice information that includes the voice information transmitted from the telephone terminal is extracted from the voice information registered in the voice complement information storage means, and the extracted voice information is transmitted to the other telephone terminal; and the telephone terminal is configured to receive and register the voice information transmitted by the voice complement information storage means.
[0026]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0027]
A mobile communication system to which a speech supplementing method according to an embodiment of the present invention is applied is configured as shown in FIG. 1, for example.
[0028]
In FIG. 1, this mobile communication system is, for example, a PHS system, in which a mobile terminal 10 (mobile phone) performs radio communication with a radio base station 20 and, via a network device 30 (for example, an exchange station device), can perform voice communication and non-call communication with other terminals (for example, fixed telephone 60, mobile terminal B70). In this example, the mobile terminal 10 on the sending side is referred to as mobile terminal A and the mobile terminal 70 on the receiving side as mobile terminal B.
[0029]
When a call is made between the mobile terminal A10 and the landline telephone 60, the voice complementing apparatus 40 connected to the network device 30 recognizes the sound information of the call, and extracts and registers (accumulates) phrases (each being one of the pieces into which the voice information is divided) that are frequently used in conversation between the mobile terminal A10 user and the landline telephone user. At this time, the phrases registered in the speech complementing apparatus 40 are also registered in the mobile terminal A10. In the present invention, when the first syllable of a phrase uttered by the user of the mobile terminal A10 matches a phrase registered in the speech complementing device 40, the speech complementing device 40 complements the phrase and sends it to the fixed telephone 60. It also sends to the mobile terminal A10, through the radio base station 20, an instruction to play the phrase stored in the mobile terminal A10 and an instruction to turn off transmission while the phrase is being played.
[0030]
As described above, in the present invention, phrases frequently used by the mobile terminal A10 user are registered, in the user's own voice, in both the voice complementing device 40 and the mobile terminal A10 itself. Thereafter, when the user of the mobile terminal A10 utters the first syllable of a phrase registered in the speech complementing device 40, the speech complementing device 40 complements the phrase that matches that syllable and plays it to the other party's landline phone 60 or mobile terminal B70. For example, when the present invention is applied to a user who frequently repeats a fixed phrase in conversation, the user can easily convey the fixed phrase to the other party without performing any input operation to register it. Moreover, since the phrase is delivered to the other party as a recording of the user's own voice, the listener hears a voice that sounds natural and comfortable.
[0031]
Next, the device configuration of the speech complementing device 40 of the present invention will be described.
[0032]
The voice complementing device 40 is configured as shown in FIG. 2, for example, and includes a voice recognition unit 41, a user identification / voice analysis processing unit 42, a buffer memory unit 43, a voice database 44, a comparison / detection unit 45, a voice playback unit 46, a central control unit 47, a voice input unit 48, a user interface unit 49, a voice output unit 50, and a base station control unit 51.
[0033]
Next, the operation of the speech complementing apparatus 40 will be described with reference to FIG.
[0034]
When a call path to the fixed telephone 60 is established based on a call request from the mobile terminal A10 and a conversation is started between the user of the mobile terminal A10 and the user of the fixed telephone 60, the voice of the conversation is input to the voice input unit 48. The speech (= speech signal) input to the speech input unit 48 is recognized by the speech recognition unit 41 (for example, phoneme text, that is, the uttered character string, is recognized from the input speech), and the recognition result is input to the user identification / voice analysis processing unit 42.
[0035]
The user identification / speech analysis processing unit 42 performs analysis processing for identifying the mobile terminal A10 user (speaker) based on the recognition result obtained by the speech recognition unit 41, and serves to extract phrases that the mobile terminal A10 user frequently uses in conversations with the fixed phone 60 user. Phrases with a high appearance frequency extracted by the user identification / speech analysis processing unit 42 are temporarily stored in the buffer memory 43. When the same phrase as a stored phrase has been repeated a predetermined number of times, the phrase is registered in the voice database 44 as a phrase frequently used by the mobile terminal A10 user. In the voice database 44, the phrases registered in this way are classified for each user.
[0036]
FIG. 3 is an internal configuration example of the voice database 44 in which phrases frequently used by the user are registered for each user.
[0037]
The voice database 44 consists of a field (1) for classifying users, a field (2) indicating user selection / non-selection, and a field (3) for managing the voices of registered phrases.
[0038]
In the field (1) for classifying the user, a subscriber number or identification ID (e.g., U1, U2, ...) that can identify the user is used. In the field (2) indicating the selection / non-selection of the user, a flag (“1”) indicating selection is set for a phrase that the user of the mobile terminal A10 has selected for registration, and a flag (“0”) indicating non-selection is set for a phrase for which non-selection has been determined. The field (3) for managing registered phrases assigns and manages a number (registered speech recognition number) for each registered phrase. For example, numbers such as V001, V002, ... are assigned for each phrase.
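The three fields described above can be pictured as one record per registered phrase. The following is only an illustrative sketch, not the layout of FIG. 3: the names `PhraseRecord`, `next_voice_number` and `valid_phrases` are hypothetical, and it assumes that registered speech recognition numbers are issued from a single running sequence.

```python
from dataclasses import dataclass


@dataclass
class PhraseRecord:
    user_id: str       # field (1): subscriber number or ID such as "U1"
    selected: int      # field (2): 1 = selected, 0 = non-selected
    voice_number: str  # field (3): registered speech recognition number, e.g. "V001"


def next_voice_number(records):
    """Return the next number in the V001, V002, ... sequence (numbering scheme assumed)."""
    return f"V{len(records) + 1:03d}"


def valid_phrases(records, user_id):
    """List the voice numbers that the given user has marked as selected."""
    return [r.voice_number for r in records
            if r.user_id == user_id and r.selected == 1]
```

With such records, the comparison / detection step would only need to consider the entries returned by `valid_phrases` for the speaking user.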
[0039]
Returning to FIG. 2, the description of the operation of the speech complementing apparatus 40 will be continued.
[0040]
In the voice complementing device 40 of the present invention, while the mobile terminal A10 user and the landline telephone 60 user are talking (during a call), the voice recognition function of the voice recognition unit 41 constantly monitors the voice information of the call. The comparison / detection unit 45 compares whether or not the voice information of the call matches the first syllable of a phrase registered in the voice database 44, and detects the voice information of the matching phrase from the voice database 44. The audio information of the phrase detected by the comparison / detection unit 45 is reproduced by the audio reproduction unit 46 and then transmitted as an audio reproduction signal from the audio output unit 50 to the fixed telephone 60. The fixed telephone 60 reproduces sound based on the sound reproduction signal output from the sound output unit 50.
[0041]
The audio information of the phrases registered in the audio database 44 is converted into an audio reproduction signal by the audio reproduction unit 46 and periodically notified from the audio output unit 50 to the mobile terminal A10, so that the user of the mobile terminal A10 can listen to the phrases registered in the audio database 44.
[0042]
The central control unit 47 instructs the base station control unit 51 to send, to the radio base station 20 to which the mobile terminal A10 is connected, a command for reproducing the voice information of the phrase registered in the mobile terminal A10. The procedure for registering the voice information of a phrase in the mobile terminal A10 will be described later. The user interface unit 49 provides an interface function when the network device 30 accesses the voice database 44, and outputs information from the voice database 44 in response to that access.
[0043]
Next, the apparatus configuration of the mobile terminal A10 of the present invention will be described with reference to FIG.
[0044]
In FIG. 4, the mobile terminal A10 includes a voice complement control unit 11, a transmission / reception control unit 12, a registered voice memory 13, an input / playback unit 14, a transmission / reception unit 15, a microphone / speaker unit 16, an operation unit 17, and an antenna unit 18.
[0045]
Next, the operation of the mobile terminal A10 will be described with reference to FIG.
[0046]
The antenna unit 18 receives, via the radio base station 20, a radio signal (including the audio information of phrases registered in the audio database 44) periodically notified from the audio output unit 50 in the audio complementing device 40, and sends it to the transmission / reception unit 15. This radio signal is subjected to frequency conversion and demodulation processing in the transmission / reception unit 15, and then the audio information of the phrase is extracted and sent to the input / reproduction unit 14. By an instruction from the operation unit 17, the user can choose whether the voice information of the phrase output from the input / playback unit 14 is registered in the registered voice memory 13 or left unregistered.
[0047]
FIG. 5 is a diagram illustrating an internal configuration example of the registered voice memory 13.
[0048]
The registered voice memory 13 consists of a field (1) for classifying users, a field (2) representing the user's selection, and a field (3) for managing the voice information of registered phrases.
[0049]
In the field (1) for classifying a user, a subscriber number or identification ID (for example, U1) that can identify the user is used. In the field (2) representing the user's selection, a flag (“1”) representing selection is set for a phrase that the mobile terminal user has selected for registration. In the field (3) for managing the voice information of registered phrases, a number (registered voice recognition number) is assigned and managed for each registered phrase. For example, numbers such as V001, V002, ... are assigned for each phrase.
[0050]
Returning to FIG. 4, the description of the operation of the mobile terminal A10 will be continued.
[0051]
The transmission / reception control unit 12 performs ON / OFF control of the transmitter and the receiver included in the transmission / reception unit 15. For example, when the transmission / reception unit 15 receives a radio signal, the transmission / reception control unit 12 controls the transmission / reception unit 15 so that the receiver is ON and the transmitter is OFF; conversely, when the transmission / reception unit 15 transmits a radio signal, it controls the transmission / reception unit 15 so that the transmitter is ON and the receiver is OFF.
[0052]
The audio information of the phrase obtained by the demodulation of the transmission / reception unit 15 is input to the input / reproduction unit 14, reproduced as audio, and output to the microphone / speaker unit 16. The user of the mobile terminal A10 listens to the voice output from the microphone / speaker unit 16 and performs a “select” or “non-select” operation on the operation unit 17, which determines whether the phrase is registered in the registered voice memory 13. That is, here, the mobile terminal A10 user selects which phrases are valid. The procedure for registering the voice of the corresponding phrase in the registered voice memory 13 based on the operation of the mobile terminal A10 user will be described later.
[0053]
For the voice of a phrase that has been determined to be “selected” by the operation of the operation unit 17, a notification indicating that it has been “selected” is output from the transmission / reception unit 15 and sent to the voice complement device 40 via the antenna unit 18. The voice database 44 in the voice complementing apparatus 40 that has received this “selection” notification turns on (validates) the voice information of the corresponding phrase.
[0054]
Further, as described in the explanation of the operation of the speech complementing apparatus 40, the radio base station 20 sends out a command for reproducing the speech of a phrase registered in the registered speech memory 13 of the mobile terminal A10. This command includes a command for turning off transmission of the mobile terminal A10 and a command specifying which of the phrases registered in the registered voice memory 13 in the mobile terminal A10 is to be played.
[0055]
The transmission / reception control unit 12 that has received the command via the transmission / reception unit 15 issues a command to turn off transmission to the transmission / reception unit 15. The command is also passed from the transmission / reception control unit 12 to the speech complement control unit 11, which reads the registered speech recognition number of the phrase to be reproduced and passes it to the input / reproduction unit 14. The input / playback unit 14 accesses the registered voice memory 13, acquires the voice information of the phrase with the corresponding registered voice recognition number, and plays back the voice information. Thus, the sound of the phrase reproduced by the input / reproduction unit 14 can be heard from the microphone / speaker unit 16.
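The terminal-side handling of the playback command can be sketched as follows. This is a minimal illustration under stated assumptions, not the terminal's actual firmware: the class `MobileTerminal`, the method `handle_play_command` and the `play` callback are hypothetical names, and the sketch assumes that transmission is switched back on once playback finishes and that an unknown number leaves the transmitter untouched.

```python
class MobileTerminal:
    """Hypothetical model of mobile terminal A10 reacting to a playback command."""

    def __init__(self, registered_voice_memory):
        # registered voice memory 13: registered speech recognition number -> audio
        self.memory = dict(registered_voice_memory)
        self.transmitter_on = True

    def handle_play_command(self, voice_number, play):
        """Turn transmission off, play the stored phrase, then turn transmission on."""
        audio = self.memory.get(voice_number)
        if audio is None:
            return False  # phrase not registered locally (assumed behaviour)
        self.transmitter_on = False  # transmission OFF while the phrase is played
        try:
            play(audio)  # stands in for the input / playback unit 14
        finally:
            self.transmitter_on = True  # resume transmission afterwards (assumed)
        return True
```

The `try` / `finally` form is a design choice of the sketch: it keeps the transmitter from staying off if playback fails part-way.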
[0056]
Next, the processing procedure of speech complementing by the speech complementing method of the present invention will be described in detail with reference to FIG.
[0057]
FIG. 6 is a flowchart showing an example of a voice complement processing procedure by the voice complementing apparatus 40 of the present invention.
[0058]
In FIG. 6, a phrase uttered while the user of the mobile terminal A10 (hereinafter referred to as user A) is talking to the user of the landline 60 (hereinafter referred to as user B) is recognized by the speech recognition unit 41 in the speech complementing device 40, and the user specifying / speech analysis processing unit 42 then performs analysis to identify the user (in this case, user A) who uttered the phrase (hereinafter referred to as phrase A). The analysis result obtained by the user identification / voice analysis processing unit 42 is temporarily stored in the buffer memory 43 (S1). That is, the phrase A uttered by the user A is temporarily stored in the buffer memory 43.
[0059]
The phrase A of the user A temporarily stored in the buffer memory 43 in this manner is sent to the comparison / detection unit 45, which determines whether or not the phrase A has been used before (S2). If the comparison / detection unit 45 determines that the phrase A of the user A output from the buffer memory 43 has not been used before (NO in S2), it instructs the buffer memory 43 to keep accumulating the phrase A. If it determines that the phrase A is the same as a previously used phrase (YES in S2), it then determines whether or not the number of times the phrase A has been used has reached a predetermined number (n) (S3). For example, with n = 3, when the number of appearances C of the phrase A is less than n (NO in S3), the number of appearances C is incremented by “1” (S4). When the number of appearances C of the phrase A is n or more (YES in S3), the process proceeds to the next step (S5).
[0060]
If it is determined in the above determination (S3) that the number of appearances C of the phrase A is n or more (YES in S3), the comparison / detection unit 45 accesses the voice database 44 and inquires whether the phrase A has already been registered (S5). When a response that the phrase A has already been registered is obtained from the voice database 44 (YES in S5), the comparison / detection unit 45 further inquires of the voice database 44 whether or not the registered phrase A has been designated as valid by the user of the mobile terminal A10 (S7).
[0061]
When the comparison / detection unit 45 receives, in the inquiry (S7), a response that the phrase A has been “selected” (YES in S7), it instructs the base station control unit 51, via the central control unit 47, to output to the radio base station 20 a signal indicating a transmission stop request for the mobile terminal A10 (S9), detects the voice information of the phrase A from the voice database 44, and sends it to the voice playback unit 46. The audio reproduction unit 46 converts the audio information of the phrase A into an audio reproduction signal that can be reproduced by the other party's terminal (the fixed telephone 60 or the mobile terminal B70) and sends the audio reproduction signal to the audio output unit 50. The audio reproduction signal output from the audio output unit 50 is sent to the landline phone 60 of the other party (user B) of the mobile terminal A10, and the sound of the phrase A is heard at the landline phone 60 from the received sound reproduction signal (S10).
[0062]
However, when it is determined in the determination (S5) that the phrase A is not registered in the voice database 44 (NO in S5), the phrase A is registered in the voice database 44. If the phrase A is not selected by the mobile terminal A10 user in the determination (S7) (NO in S7), the process is stopped without performing complementation.
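The decision flow S2 to S10 described above can be summarized in a short sketch. It is only one possible reading of the flowchart: the function name `complement_step` and the returned labels are hypothetical, the first observation is assumed to set the counter to 1, the S3 test is read as "C is n or more", and a newly registered phrase is assumed to start as not yet selected.

```python
def complement_step(phrase, counts, database, n=3):
    """Process one recognized phrase and report which branch of the flow was taken.

    counts   -- dict: phrase -> number of appearances C (stand-in for buffer memory 43)
    database -- dict: phrase -> True if the user selected it (stand-in for voice database 44)
    """
    if phrase not in counts:       # S2: NO, phrase not used before
        counts[phrase] = 1         # keep accumulating (initial value assumed)
        return "accumulate"
    if counts[phrase] < n:         # S3: NO, fewer than n appearances so far
        counts[phrase] += 1        # S4: increment C
        return "count"
    if phrase not in database:     # S5: NO, not yet registered
        database[phrase] = False   # register; user has not selected it yet (assumed)
        return "register"
    if not database[phrase]:       # S7: NO, user did not select the phrase
        return "stop"
    return "complement"            # S7: YES, proceed to S9 and S10
```

Only the `"complement"` result corresponds to the transmission stop request and the playback toward the other party; every other result leaves the call untouched.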
[0063]
Thus, in the speech complementing method of the present invention, phrases frequently used by the user A are automatically registered in the speech database 44. The call between the user A and the user B is constantly monitored, and when the user A utters the first syllable of the phrase A registered in the voice database 44, the phrase that matches that first syllable is extracted from the plurality of phrases registered in the voice database 44. For example, if the phrase A registered in the voice database 44 is “always indebted”, when the user A utters the first syllable “always” of the phrase A, the comparison / detection unit 45 accesses the voice database 44 and searches the registered phrases for a phrase having “always” as the first syllable. If this search finds only the phrase “always indebted” as a phrase having “always” as the first syllable, this phrase is extracted as matching the phrase A.
[0064]
The phrase A thus extracted is provided to the user B, who can hear the phrase “always indebted” in the voice of the user A. However, when a plurality of phrases having “always” as the first syllable are detected, the next character (in this case, “o”) is appended to “always” and the search is performed again. In such a case, the user A is notified that a plurality of phrases having “always” as the first syllable have been detected, and when the user A utters the next character in response to the notification, the speech complement device 40 automatically detects the corresponding phrase A.
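The search and its narrowing step can be illustrated with a small prefix match. This is a hedged sketch only: it assumes matching is done on the recognized text of each phrase, the function name `find_phrase` is hypothetical, and the example phrases other than “always indebted” are invented for illustration.

```python
def find_phrase(registered, heard):
    """Return (phrase, candidates) for the text heard so far.

    phrase is the single registered phrase starting with `heard`, or None when
    there is no match or when several phrases match and more input is needed.
    """
    candidates = [p for p in registered if p.startswith(heard)]
    if len(candidates) == 1:
        return candidates[0], candidates
    return None, candidates


# Usage: two phrases share the opening "always", so the user is asked for more.
registered = ["always indebted", "always on time", "thank you"]
phrase, candidates = find_phrase(registered, "always")
if phrase is None and len(candidates) > 1:
    # Appending the next uttered character narrows the search.
    phrase, candidates = find_phrase(registered, "always i")
```

Returning the candidate list alongside the result lets the caller tell "no match" apart from "several matches", which is the case where the user would be notified.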
[0065]
Therefore, since the user A can convey a phrase to the other party merely by uttering its first syllable, it is not necessary to speak every fixed phrase in full during the conversation, and convenience is improved. Moreover, for an elderly person or a user who has difficulty speaking, registering in advance the phrases to be conveyed also makes it possible to carry on a conversation smoothly.
[0066]
In addition, the speech complementing apparatus 40 of the present invention sends the speech of the phrase A extracted from the speech database 44 to the user B, and instructs the mobile terminal A10 of the user A to turn off transmission and to play the phrase A registered in the mobile terminal A10. The mobile terminal A10 that has received this instruction via the radio base station 20 plays the registered phrase A and turns off transmission during that time. That is, since the transmission of the mobile terminal A10 is stopped while the complemented phrase is being played, the power consumption of the mobile terminal A10 can be further reduced.
[0067]
In the above description, the counterpart terminal of the mobile terminal A10 is assumed to be the fixed telephone 60. However, the present invention is not limited to this, and the counterpart terminal may of course be the mobile terminal B70.
[0068]
In the above embodiment, as an example of registering a phrase in the voice database 44 in the voice complementing device 40, the case was described in which, while the mobile terminal A10 user is talking to the landline 60 user, the call content is recognized by the voice recognition unit 41 and stored in the buffer memory 43, and a phrase that is repeated a predetermined number of times or more is registered in the voice database 44 in the user's own voice.
[0069]
Next, an example of a method for registering a phrase in the mobile terminal A10 is shown below.
[0070]
For example, FIG. 7 is a flowchart showing an example of a procedure for registering a phrase in the mobile terminal A10.
[0071]
(Registration method 1)
In FIG. 7, the mobile terminal A10 periodically receives information (registration notification) related to the phrases registered in the speech database 44 from the speech complement device 40 (S21). When receiving the registration notification, the mobile terminal A10 accesses the speech database 44 of the speech complementing device 40 at a predetermined timing and receives the speech information of the phrases registered in the speech database 44. After receiving the voice information of the phrases, the mobile terminal A10 reproduces the voice (S23), and the user decides, by operating the operation unit 17, which of the reproduced phrases are valid (selection or non-selection). When an operation indicating “selection” is performed by the user of the mobile terminal A10 on the operation unit 17 (YES in S24), the selected phrase is stored in the registered voice memory 13 in the mobile terminal A10 itself, and at the same time a notification (selection notification) indicating that the phrase is valid is sent to the speech database 44 in the speech complementing device 40 (S25). On the other hand, when an operation indicating “non-selection” is performed by the user of the mobile terminal A10 on the operation unit 17 (NO in S24), the “non-selected” phrase is not registered, and a notification (non-selection notification) indicating that the phrase is not valid is sent to the voice database 44 in the speech complementing device 40.
Next, another example of registering a phrase in the mobile terminal A10 will be described with reference to FIG.
[0072]
(Registration method 2)
In FIG. 8, the mobile terminal A10 periodically receives information (registration notification) related to the phrases registered in the voice database 44 from the voice complement device 40 (S31). When receiving the registration notification, the mobile terminal A10 accesses the speech database 44 in the speech complementing apparatus 40 using i-mode (one of the character information services) or the like (S32), and downloads a list of the voice information of the phrases registered in the speech database 44 (S33). Each item of downloaded phrase audio information is assigned a number that identifies it, and the user of the mobile terminal A10 can listen to the audio of the phrase with a given number by clicking that number (number selection) on the screen of the mobile terminal A10.
[0073]
Thus, when the number of a phrase is selected (S34) and the sound of the selected phrase is reproduced (S35), the user decides, by operating the operation unit 17, whether the reproduced phrase is valid (selection or non-selection). When an operation indicating “selection” is performed by the user of the mobile terminal A10 on the operation unit 17 (YES in S36), the selected phrase is stored in the registered voice memory 13 in the mobile terminal A10 itself, and at the same time a notification (selection notification) indicating that the phrase is valid is sent to the speech database 44 in the speech complementing device 40 (S37). On the other hand, when an operation indicating “non-selection” is performed by the user of the mobile terminal A10 on the operation unit 17 (NO in S36), the “non-selected” phrase is not registered, and a notification (non-selection notification) indicating that the phrase is not valid is sent to the voice database 44 in the speech complementing device 40.
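The selection step shared by both registration methods can be sketched as a pair of small functions, one on the terminal side and one on the database side. All names here (`handle_selection`, `apply_notification`, the dictionary keys) are hypothetical, and the notification is modeled as a plain dictionary rather than any actual signaling format.

```python
def handle_selection(memory, voice_number, audio, selected):
    """Terminal side: store a selected phrase and build the notification to send.

    memory -- dict standing in for the registered voice memory 13
    """
    if selected:
        memory[voice_number] = audio  # register in the terminal itself
        kind = "selection"
    else:
        kind = "non-selection"        # nothing is stored locally
    return {"voice_number": voice_number, "notification": kind}


def apply_notification(flags, note):
    """Database side: set the selection flag ("1" or "0") for the notified phrase."""
    flags[note["voice_number"]] = 1 if note["notification"] == "selection" else 0
```

Keeping the two sides separate mirrors the description, in which the terminal registers the phrase and the voice database merely validates or invalidates its own copy.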
[0074]
In the above (Registration method 1) and (Registration method 2), the information (registration notification) related to the phrases that is periodically notified from the speech complementing apparatus 40 to the mobile terminal A10 may be voice information or may be in another format such as text. In the latter case, after the voice information of the phrase registered in the voice database 44 is converted into a voice playback signal by the voice playback unit 46, the signal is read by the central control unit 47, converted into text format, and sent to the mobile terminal A10 via the base station control unit 51.
[0075]
In the above example, the registration function of the voice database 44 in the voice complementing device 40 corresponds to the voice complement information storing means; the voice comparison and detection function of the comparison / detection unit 45 corresponds to the voice information extraction means; the control function of the central control unit 47 and the control signal transmission function of the base station control unit 51 correspond to the reproduction instruction means and the voice transmission stop instruction means; and the user identification / speech analysis function of the user identification / speech analysis processing unit 42 together with the registration function of the speech database 44 corresponds to the speech complementary information automatic registration means. The voice output function of the voice output unit 50 corresponds to the voice information transmitting means, the first voice information transmitting means, and the second voice information transmitting means. Further, the control function of the central control unit 47 and the notification signal transmission function of the base station control unit 51 in the voice complementing apparatus 40 correspond to the voice information notification means.
[0076]
Furthermore, the registered voice memory 13 in the mobile terminal A10 corresponds to the first to third registration means, the determination function of the operation unit 17 corresponds to the voice information selection means, and the voice playback function of the input / playback unit 14 corresponds to the voice playback means. The transmission function of the transmission / reception unit 15 corresponds to the reporting means, and the function of the transmission / reception control unit 12 for controlling the transmission / reception unit 15 corresponds to the transmission stop means.
[0077]
[Effects of the Invention]
As described above, according to the present invention described in claims 1 to 10, phrases frequently used by a user are registered in the speech complementing apparatus. When, while the user who registered the phrases is talking, the speech complementing apparatus recognizes that the user's speech matches the first syllable of a registered phrase, it commands the radio base station to which the user's telephone terminal is connected to send an instruction to turn off transmission and an instruction to play the corresponding phrase registered in the telephone terminal itself. The telephone terminal that has received the instructions from the radio base station plays the registered phrase and turns off the transmission power during that time. At the same time, the voice complemented by the voice complementing device is played to the other party's telephone terminal.
[0078]
As a result, every time the user utters a frequently used phrase, the above-described speech complementation is performed and the transmission power is turned off, so the power consumption of the telephone terminal can be further reduced. In addition, the user does not need to perform troublesome data input from the telephone terminal, and because the user's own voice is used as the complementing voice, a natural voice complement service can be provided.
[0079]
Further, according to the present invention described in claims 11 to 17, it is possible to realize a speech complementing apparatus that can complement speech according to the speech complementing method as described above.
[0080]
Furthermore, according to the present invention of claims 18 to 24, it is possible to provide a telephone terminal device capable of reducing the transmission power while reproducing the voice according to the voice complementing method as described above.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a configuration example of a mobile communication system to which a speech complementation method according to an embodiment of the present invention is applied.
FIG. 2 is a block diagram of a speech supplement apparatus in the mobile communication system shown in FIG. 1.
FIG. 3 is a diagram showing an internal configuration example of a voice database that registers frequently used phrases for each user.
FIG. 4 is a block diagram of mobile terminal A in the mobile communication system shown in FIG. 1.
FIG. 5 is a diagram showing an internal configuration example of a registered voice memory of a mobile terminal in which a phrase selected by a user is registered.
FIG. 6 is a flowchart showing an example of a voice complement processing procedure by the voice complement device of the present invention.
FIG. 7 is a diagram illustrating an example (part 1) of a procedure for registering a phrase in a mobile terminal.
FIG. 8 is a diagram illustrating an example (part 2) of a procedure for registering a phrase in a mobile terminal.
[Explanation of symbols]
10 Mobile terminal A
11 Speech Complementation Control Unit
12 Transmission / reception controller
13 Registered voice memory
14 Input and playback unit
15 Transceiver
16 Microphone / speaker section
17 Operation unit
18 Antenna section
20 Radio base station
30 Network equipment
40 Voice Complementary Device
41 Voice recognition unit
42 User identification / voice analysis processing unit
43 Buffer memory section
44 Voice database
45 Comparison / detection unit
46 Audio playback unit
47 Central control unit
48 Voice input part
49 User interface
50 Audio output section
51 Base station controller
60 Landline
70 Mobile terminal B

Claims

In the voice complementing method for complementing voice information when a telephone terminal performs voice communication with another telephone terminal via a predetermined communication network,
Voice information from the telephone terminal is registered in advance in voice supplement information storage means connected to a predetermined communication network,
When the user's voice is input while the telephone terminal and the other telephone terminal are performing voice communication, voice information including the voice information transmitted from the telephone terminal is extracted from the voice information registered in the voice complementary information storage means,
A speech complementing method of transmitting the extracted speech information to the other telephone terminal.

The speech completion method according to claim 1,
The voice supplement information storage means transmits the registered voice information to the telephone terminal,
An audio complementing method in which the telephone terminal receives and registers the audio information.

The speech completion method according to claim 1,
A speech complementing method in which the voice complement information storage means extracts voice information with a high appearance frequency from the voice information transmitted from the user while the telephone terminal and the other telephone terminal are performing voice communication, and automatically registers it.
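
Automatic registration by appearance frequency could be sketched as follows. The claim does not say how "high appearance frequency" is decided, so the fixed count threshold here is an assumption, as are the text representation of voice information and the name `FrequentPhraseRegistrar`.

```python
from collections import Counter
from typing import List


class FrequentPhraseRegistrar:
    """Hypothetical counter that auto-registers phrases heard often enough."""

    def __init__(self, threshold: int = 3) -> None:
        # Assumption: "high appearance frequency" means at least `threshold` uses.
        self.threshold = threshold
        self.counts: Counter = Counter()
        self.registered: List[str] = []

    def observe(self, phrase: str) -> bool:
        """Count one occurrence during a call.

        Returns True only when this occurrence causes a new registration.
        """
        self.counts[phrase] += 1
        if self.counts[phrase] >= self.threshold and phrase not in self.registered:
            self.registered.append(phrase)
            return True
        return False
```

A real system would need some way to segment and normalize utterances before counting them; that step is outside the scope of this sketch.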

The speech completion method according to claim 3,
A speech complementing method in which the telephone terminal receives, from the voice supplement information storage means, the voice information with a high appearance frequency extracted by that storage means, and registers it.

The speech complementing method according to claim 2 or 4,
The voice complementary information storage means accumulates either voice information or voice information with high appearance frequency, notifies the telephone terminal of the stored voice information or information for transmitting the voice information,
The telephone terminal reports the selection result to the voice supplement information storage means after the voice information to be registered in the voice supplement information storage means is selected by the user based on the notification,
The voice supplement information storage means registers voice information based on the report from the telephone terminal.
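
The exchange in this claim (the storage means accumulates candidates and notifies the terminal, the user selects, the terminal reports, the storage means registers) can be sketched as a simple round trip. The class `CandidateStore`, the function `terminal_select`, and the rule that reported phrases not among the candidates are ignored are all assumptions made for illustration.

```python
from typing import Callable, List, Set


class CandidateStore:
    """Hypothetical network-side store of candidate and registered phrases."""

    def __init__(self) -> None:
        self.candidates: List[str] = []
        self.registered: Set[str] = set()

    def accumulate(self, phrase: str) -> None:
        """Keep a phrase as a registration candidate."""
        if phrase not in self.candidates:
            self.candidates.append(phrase)

    def notify(self) -> List[str]:
        """Build the notification sent to the terminal (a copy of the candidates)."""
        return list(self.candidates)

    def register_from_report(self, selected: List[str]) -> None:
        """Register only reported phrases that were actually offered."""
        for phrase in selected:
            if phrase in self.candidates:
                self.registered.add(phrase)


def terminal_select(notification: List[str], choose: Callable[[str], bool]) -> List[str]:
    """Terminal side: apply the user's choice and build the report."""
    return [phrase for phrase in notification if choose(phrase)]
```

Here `choose` stands in for whatever selection interface the terminal offers to the user.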

The speech completion method according to claim 5,
A speech complementing method in which the telephone terminal registers the voice information obtained based on the selection result after the voice information to be registered in the voice supplement information storage means has been selected by the user.

The speech completion method according to any one of claims 1 to 6,
When the user's voice is input while the telephone terminal and the other telephone terminal are performing voice communication, the voice supplement information storage means extracts, from the voice information registered in the voice supplement information storage means, voice information that includes the voice information transmitted from the telephone terminal, and transmits to the telephone terminal a signal serving as an instruction to reproduce the same voice information as the extracted voice information,
An audio complementing method in which the telephone terminal reproduces audio information registered in advance according to the instruction.

The speech completion method according to claim 7,
When the voice supplement information storage means has transmitted to the telephone terminal the instruction to reproduce the same voice information as the extracted voice information, it transmits to the telephone terminal an instruction to stop transmission of voice information while the telephone terminal is reproducing the voice information based on that instruction,
A voice complementing method in which the telephone terminal stops voice transmission regardless of voice input from a user while playing back voice information registered in advance according to the instruction.
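
The terminal-side behaviour described here, reproducing locally registered voice information on instruction and suppressing voice transmission for the duration of the playback, can be sketched as a small state holder. The class `TerminalSketch`, the integer phrase identifiers, and the explicit `finish_playback` call are assumptions; the claim only states that transmission stops regardless of user input while playback is in progress.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TerminalSketch:
    """Hypothetical telephone terminal with a local registered voice memory."""

    # phrase id -> locally registered voice information (text stand-in)
    registered_voice: Dict[int, str]
    transmitting: bool = True
    uplink: List[str] = field(default_factory=list)

    def start_playback(self, phrase_id: int) -> str:
        """Handle a playback instruction; voice transmission stops meanwhile.

        Raises KeyError if the phrase was never registered on the terminal.
        """
        phrase = self.registered_voice[phrase_id]
        self.transmitting = False
        return phrase

    def finish_playback(self) -> None:
        """Resume normal voice transmission once playback has ended."""
        self.transmitting = True

    def on_voice_input(self, chunk: str) -> bool:
        """Send microphone input unless transmission is stopped.

        Returns True if the chunk was transmitted, False if it was dropped.
        """
        if not self.transmitting:
            return False
        self.uplink.append(chunk)
        return True
```

Because nothing is sent on the uplink during playback in this sketch, the radio transmitter can stay idle for that interval.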

In the voice complementing method for complementing voice information when a telephone terminal performs voice communication with another telephone terminal via a predetermined communication network,
Voice information from the telephone terminal is registered in advance in voice supplement information storage means connected to a predetermined communication network,
When the user's voice is input while the telephone terminal and the other telephone terminal are performing voice communication, voice information that includes the voice information transmitted from the telephone terminal is extracted from the voice information registered in the voice supplement information storage means,
A speech complementing method for transmitting the extracted speech information to the telephone terminal and the other telephone terminal.

The speech completion method according to any one of claims 1 to 9,
A speech complementing method using a mobile terminal device connected to a predetermined communication network as the telephone terminal.

In a speech complementing apparatus that supplements speech information when a telephone terminal performs voice communication with another telephone terminal via a predetermined communication network,
Voice complementary information storage means connected to a predetermined communication network and pre-registering voice information from the telephone terminal;
Voice information extraction means for extracting, from the voice information registered in the voice complementary information storage means, voice information that includes the voice information transmitted from the telephone terminal when the user's voice is input while the telephone terminal and the other telephone terminal are performing voice communication,
A speech complementing device comprising speech information transmitting means for transmitting the extracted speech information to the other telephone terminal.

The speech complementing device according to claim 11,
A speech complementing device in which the voice complement information storage means includes first voice information transmission means for sending, to the telephone terminal, the voice information that includes the voice information transmitted from the telephone terminal when that voice information is extracted.

The speech supplement apparatus according to claim 11 or 12,
A speech complementing device in which the voice complement information storage means has automatic information registration means for extracting voice information with a high appearance frequency from the voice information transmitted from the user while the telephone terminal and the other telephone terminal are performing voice communication, and automatically registering it.

The speech supplement device according to any one of claims 11 to 13,
The voice supplement information storage means has voice information notification means for notifying the telephone terminal, when either voice information or voice information with a high appearance frequency is extracted, of the extracted voice information or of information for transmitting that voice information,
A speech complementing device that registers voice information based on a report transmitted when the telephone terminal selects the voice information to be registered in the voice supplement information storage means.

The speech complementation device according to any one of claims 11 to 14,
A speech complementing device in which the voice supplement information storage means has playback instruction means for extracting, from the voice information registered in the voice supplement information storage means, voice information that includes the voice information transmitted from the telephone terminal when the user's voice is input while the telephone terminal and the other telephone terminal are performing voice communication, and for transmitting to the telephone terminal a signal serving as an instruction to reproduce the same voice information as the extracted voice information.

The speech complementation device according to any one of claims 11 to 15,
A speech complementing device in which the voice supplement information storage means comprises voice transmission stop instruction means for sending to the telephone terminal, when the instruction to reproduce the same voice information as the extracted voice information has been transmitted to the telephone terminal, an instruction to stop transmission of voice information while the telephone terminal is reproducing voice information based on that instruction.

In a speech complementing apparatus that supplements speech information when a telephone terminal performs voice communication with another telephone terminal via a predetermined communication network,
Voice complementary information storage means connected to a predetermined communication network and pre-registering voice information from the telephone terminal;
Voice information extraction means for extracting, from the voice information registered in the voice complementary information storage means, voice information that includes the voice information transmitted from the telephone terminal when the user's voice is input while the telephone terminal and the other telephone terminal are performing voice communication,
A speech complementing device comprising second speech information transmitting means for transmitting the extracted speech information to the telephone terminal and the other telephone terminal.

In a telephone terminal that communicates with other telephone terminals via a predetermined communication network,
The telephone terminal is connected, via the predetermined communication network, to a speech complementing device that complements voice information when voice communication is performed; the speech complementing device registers voice information from the telephone terminal in advance in voice complementary information storage means, extracts voice information that includes the voice information transmitted from the telephone terminal from the voice information registered in the voice complementary information storage means, and transmits the extracted voice information to the other telephone terminal,
A telephone terminal having first registration means for receiving and registering the voice information transmitted from the voice complementary information storage means.

The telephone terminal according to claim 18,
The telephone terminal has second registration means for receiving and registering voice information with a high appearance frequency extracted by the voice supplement information storage means from the voice supplement information storage means.

The telephone terminal according to claim 18 or 19,
The telephone terminal includes: a voice information selection unit that allows a user to select voice information to be registered in the voice supplement information storage unit based on a notification from the voice information notification unit of the voice supplement information storage unit;
A telephone terminal having reporting means for reporting the selection result to the voice complementary information storage means.

The telephone terminal according to any one of claims 18 to 20,
The telephone terminal has third registration means for registering voice information obtained based on a selection result after voice information to be registered in the voice complementary information storage means is selected by a user.

The telephone terminal according to any one of claims 18 to 21,
The telephone terminal includes a voice reproduction unit that reproduces voice information registered in advance according to the instruction content instructed by the reproduction instruction unit of the voice supplement information storage unit.

The telephone terminal according to any one of claims 18 to 22,
A telephone terminal having transmission stop means for stopping voice transmission, regardless of voice input from the user, while reproducing voice information registered in advance in accordance with the instruction issued by the voice transmission stop instruction means of the voice complementary information storage means.

The telephone terminal according to any one of claims 18 to 23,
A telephone terminal using a mobile terminal device connected to a predetermined communication network as the telephone terminal.