JP7073705B2

JP7073705B2 - Call terminal, speaker identification server, call system, call terminal processing method, speaker identification server processing method and program

Info

Publication number: JP7073705B2
Application number: JP2017242497A
Authority: JP
Inventors: 和真梅津
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2017-12-19
Filing date: 2017-12-19
Publication date: 2022-05-24
Anticipated expiration: 2037-12-19
Also published as: JP2019110450A

Description

The present invention relates to a call terminal, a speaker identification server, a call system, a processing method for a call terminal, a processing method for a speaker identification server, and a program.

Patent Document 1 discloses a call terminal that, after a call is placed and before the call starts, acquires biometric information of the caller, authenticates the caller, and transmits the authentication result to the other party's terminal. Faces, voices, fingerprints, and the like are given as examples of biometric information.

Patent Document 2 discloses a telephone device that, upon an incoming call from a remote terminal, answers temporarily without ringing and without notifying the user, accepts voice from the remote terminal, and searches for a call-permitted voice model that matches the feature data of that voice. If such a model is found, the device outputs a ringing sound to notify the user of the incoming call.

Patent Document 3 discloses a communication system including a relay device that relays communication between communication terminals. The originating communication terminal acquires authentication information of the caller in response to a call operation, and transmits a call request including the authentication information to the relay device. The relay device identifies the caller based on the authentication information. The relay device also determines whether the identified caller is included among the call-rejection users registered in association with the telephone number that is the destination of the call. If the caller is included, the relay device sends a rejection notice to the originating communication terminal.

Patent Document 1: Japanese Unexamined Patent Publication No. 2004-21748
Patent Document 2: Japanese Unexamined Patent Publication No. 2011-4094
Patent Document 3: Japanese Unexamined Patent Publication No. 2009-77132

During a call, it is sometimes desirable to identify the call partner (the person currently speaking on the other end). Examples include, without limitation, cases where identity verification is required and cases where the call partner cannot be identified by voice alone.

According to the techniques described in Patent Documents 1 to 3, the person who placed the call can be identified before the call starts. However, the call partner cannot be identified during the call. The person who placed the call and the person who is actually talking may differ.

An object of the present invention is to provide a technique for identifying a call partner during a call.

According to the present invention, there is provided a call terminal comprising:
call means for making a call with another call terminal;
biometric information acquisition means for repeatedly acquiring biometric information during the call; and
first transmission means for repeatedly transmitting, during the call, speaker identification information, which is the biometric information or a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information.

Further, according to the present invention, there is provided a speaker identification server comprising:
reception means for receiving, from a call terminal that is in a call, speaker identification information, which is biometric information repeatedly acquired during the call or a feature amount extracted from the biometric information;
speaker identification means for identifying a speaker based on the speaker identification information and pre-registered reference information; and
transmission means for transmitting information indicating the identified speaker to the call terminal that sent the speaker identification information, or to a call terminal that is in a call with the call terminal that sent the speaker identification information.

Further, according to the present invention, there is provided a call system comprising a call terminal and a speaker identification server, wherein
the call terminal includes:
call means for making a call with another call terminal;
biometric information acquisition means for repeatedly acquiring biometric information during the call; and
first transmission means for repeatedly transmitting, during the call, speaker identification information, which is the biometric information or a feature amount extracted from the biometric information, to the speaker identification server, and
the speaker identification server includes:
reception means for receiving the speaker identification information from the call terminal that is in the call;
speaker identification means for identifying a speaker based on the speaker identification information and pre-registered reference information; and
transmission means for transmitting information indicating the identified speaker to the call terminal that sent the speaker identification information, or to a call terminal that is in a call with the call terminal that sent the speaker identification information.

Further, according to the present invention, there is provided a processing method for a call terminal, in which a computer executes:
a call step of making a call with another call terminal;
a biometric information acquisition step of repeatedly acquiring biometric information during the call; and
a first transmission step of repeatedly transmitting, during the call, speaker identification information, which is the biometric information or a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information.

Further, according to the present invention, there is provided a program that causes a computer to function as:
call means for making a call with another call terminal;
biometric information acquisition means for repeatedly acquiring biometric information during the call; and
first transmission means for repeatedly transmitting, during the call, speaker identification information, which is the biometric information or a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information.

Further, according to the present invention, there is provided a processing method for a speaker identification server, in which a computer executes:
a reception step of receiving, from a call terminal that is in a call, speaker identification information, which is biometric information repeatedly acquired during the call or a feature amount extracted from the biometric information;
a speaker identification step of identifying a speaker based on the speaker identification information and pre-registered reference information; and
a transmission step of transmitting information indicating the identified speaker to the call terminal that sent the speaker identification information, or to a call terminal that is in a call with the call terminal that sent the speaker identification information.

Further, according to the present invention, there is provided a program that causes a computer to function as:
reception means for receiving, from a call terminal that is in a call, speaker identification information, which is biometric information repeatedly acquired during the call or a feature amount extracted from the biometric information;
speaker identification means for identifying a speaker based on the speaker identification information and pre-registered reference information; and
transmission means for transmitting information indicating the identified speaker to the call terminal that sent the speaker identification information, or to a call terminal that is in a call with the call terminal that sent the speaker identification information.

According to the present invention, a call partner can be identified during a call.

FIG. 1 is a diagram showing an example of the hardware configuration of the devices of the present embodiment.
FIG. 2 is a diagram showing an example of a functional block diagram of the call terminal 10 of the present embodiment.
FIG. 3 is a flowchart showing an example of the processing flow of the call terminal 10 of the present embodiment.
FIG. 4 is a diagram showing an example of a functional block diagram of the speaker identification server 20 of the present embodiment.
FIG. 5 is a diagram schematically showing an example of information processed by the speaker identification server 20 of the present embodiment.
FIG. 6 is a flowchart showing an example of the processing flow of the speaker identification server 20 of the present embodiment.
FIG. 7 is a diagram showing an example of a functional block diagram of the call terminal 10 of the present embodiment.
FIG. 8 is a flowchart showing an example of the processing flow of the call terminal 10 of the present embodiment.
FIG. 9 is a flowchart showing an example of the processing flow of the speaker identification server 20 of the present embodiment.
FIG. 10 is a diagram showing an example of a functional block diagram of the call terminal 10 of the present embodiment.
FIG. 11 is a diagram showing an example of a functional block diagram of the speaker identification server 20 of the present embodiment.
FIG. 12 is a diagram schematically showing an example of information registered in the speaker identification server 20 of the present embodiment.

<First Embodiment>
First, an outline of the call system of the present embodiment will be described. The call system has a call terminal and a speaker identification server.

The call terminal has means for making a call with another call terminal, means for repeatedly acquiring biometric information during the call, and means for repeatedly transmitting, during the call, speaker identification information, which is the biometric information or a feature amount extracted from the biometric information, to the speaker identification server.

The speaker identification server has means for receiving the speaker identification information from a call terminal that is in a call, means for identifying a speaker based on the speaker identification information and pre-registered reference information, and means for transmitting information indicating the identified speaker to the call terminal that is in a call with the call terminal that sent the speaker identification information.

With the call system of the present embodiment, a speaker can be identified during a call.

Next, the configurations of the call terminal and the speaker identification server will be described in detail. First, an example of their hardware configuration will be described. Each function of the call terminal and the speaker identification server of the present embodiment is realized by any combination of hardware and software, centered on a CPU (Central Processing Unit) of any computer, a memory, a program loaded into the memory, a storage unit such as a hard disk that stores the program (which can store not only programs stored before the device is shipped but also programs obtained from storage media such as CDs (Compact Discs) or downloaded from servers on the Internet), and a network connection interface. Those skilled in the art will understand that there are various modifications of the implementation method and the device.

FIG. 1 is a block diagram illustrating the hardware configuration of each of the call terminal and the speaker identification server of the present embodiment. As shown in FIG. 1, each of the call terminal and the speaker identification server has a processor 1A, a memory 2A, an input/output interface 3A, a peripheral circuit 4A, and a bus 5A. The peripheral circuit 4A includes various modules. The call terminal and the speaker identification server do not have to have the peripheral circuit 4A.

The bus 5A is a data transmission path through which the processor 1A, the memory 2A, the peripheral circuit 4A, and the input/output interface 3A exchange data with one another. The processor 1A is an arithmetic processing unit such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit). The memory 2A is a memory such as a RAM (Random Access Memory) or a ROM (Read Only Memory). The input/output interface 3A includes interfaces for acquiring information from input devices (e.g., keyboard, mouse, microphone), external devices, external servers, external sensors, and the like, and interfaces for outputting information to output devices (e.g., display, speaker, printer, mailer), external devices, external servers, and the like. The processor 1A can issue commands to each module and perform computations based on their results.

Next, the functional configurations of the call terminal and the speaker identification server will be described. First, the functional configuration of the call terminal will be described. As shown in the functional block diagram of FIG. 2, the call terminal 10 has a call unit 11, a biometric information acquisition unit 12, and a first transmission unit 13.

The call unit 11 makes a call with another call terminal. That is, the call unit 11 performs call origination processing and call reception processing in response to user operations. When a call is established through the origination processing of the originating call terminal and the reception processing of the receiving call terminal, the call unit 11 transmits voice data input via the microphone to the call partner's call terminal. The call unit 11 also receives voice data from the call partner's call terminal, processes the voice data, and outputs sound from the speaker. The means by which the call unit 11 carries out the call is not particularly limited; for example, SIP (Session Initiation Protocol) can be used.

The biometric information acquisition unit 12 repeatedly acquires biometric information during a call (that is, while a call is established through call origination processing and the corresponding call reception processing; the same applies hereinafter).

The biometric information acquisition unit 12 may acquire biometric information repeatedly at predetermined time intervals. Alternatively, the biometric information acquisition unit 12 may acquire biometric information at least once after the call starts, and thereafter acquire biometric information each time it is determined that the speaker has changed. A change of speaker can be detected, for example, by voiceprint authentication using the speaker's voice input via the microphone.
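The two acquisition timings described above can be illustrated with a small decision function. This is only a sketch under stated assumptions: `Policy`, `should_acquire`, and the 10-second default interval are hypothetical names and values that do not appear in the description, and the speaker-change flag is assumed to come from a separate detector (for example, voiceprint-based).

```python
from enum import Enum
from typing import Optional


class Policy(Enum):
    PERIODIC = "periodic"            # acquire at every fixed interval
    ON_SPEAKER_CHANGE = "on_change"  # acquire once, then whenever the speaker changes


def should_acquire(
    policy: Policy,
    seconds_since_last: Optional[float],  # None if nothing has been acquired yet in this call
    interval_s: float = 10.0,             # assumed period for PERIODIC (not from the source)
    speaker_changed: bool = False,        # output of a separate change detector
) -> bool:
    """Decide whether biometric information should be acquired now."""
    if seconds_since_last is None:
        return True                       # acquire at least once after the call starts
    if policy is Policy.PERIODIC:
        return seconds_since_last >= interval_s
    return speaker_changed
```

Under the periodic policy the elapsed time alone drives acquisition; under the change-driven policy the function ignores elapsed time after the first acquisition.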

The biometric information acquisition unit 12 has means for acquiring, as biometric information, an image of a part of the body (e.g., the face), sound reflected by the body (e.g., inside the ear; information used for ear authentication), a fingerprint, or a voice. The biometric information acquisition unit 12 may also have means for acquiring other biometric information.

For example, the call terminal 10 may have a camera, and the camera may capture an image of a part of the speaker's body (e.g., the face). Alternatively, an earphone with an integrated microphone may be attached to the call terminal 10. In that case, sound may be output from the earphone, and the sound reflected inside the speaker's ear may be collected by the microphone.

Alternatively, the call terminal 10 may have a fingerprint sensor, and the fingerprint sensor may collect the speaker's fingerprint. The fingerprint sensor may be provided on the grip portion of the call terminal 10. This allows the speaker's fingerprint to be collected during a call without the speaker being aware of it. Alternatively, the call terminal 10 may have a microphone, and the microphone may collect the speaker's voice. During a call, the speaker's voice is continuously input via the microphone, and the voice data is transmitted to the call partner's call terminal. The biometric information acquisition unit 12 can acquire this voice data (the speaker's voice) as biometric information.

Hereinafter, a call terminal that is in a call with the call terminal 10 is referred to as a "call partner terminal".

The first transmission unit 13 repeatedly transmits speaker identification information to the speaker identification server 20 during a call. The speaker identification information is the biometric information acquired by the biometric information acquisition unit 12, or a feature amount extracted from that biometric information. The first transmission unit 13 may transmit the speaker identification information to the speaker identification server 20 each time the biometric information acquisition unit 12 acquires biometric information.

The first transmission unit 13 may also transmit the communication address (e.g., IP address) of the call partner terminal to the speaker identification server 20.

Next, an example of the processing flow of the call terminal 10 will be described with reference to the flowchart of FIG. 3. The processing of FIG. 3 is performed when a call with another call terminal is established.

After the call starts, when the biometric information acquisition timing arrives (Yes in S10), the biometric information acquisition unit 12 acquires the speaker's biometric information (S11). The first transmission unit 13 then transmits speaker identification information to the speaker identification server 20 (S12). The speaker identification information is the biometric information acquired in S11, or a feature amount extracted from that biometric information.

While the call has not ended (No in S13), the call terminal 10 waits for the next biometric information acquisition timing (S10). When the acquisition timing arrives (Yes in S10), the same processing is performed. When the call ends (Yes in S13), the processing ends.
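The flow of FIG. 3 (S10 to S13) can be sketched as a simple loop. This is a minimal illustration rather than the implementation described in the patent: `run_terminal_loop`, `call_active`, `acquire`, `send`, and the fixed `interval_s` are hypothetical, a fixed interval is only one of the acquisition timings the description allows, and the order of the end-of-call check is simplified so that it is evaluated before each acquisition.

```python
import time
from typing import Callable


def run_terminal_loop(
    call_active: Callable[[], bool],   # S13: returns False once the call has ended
    acquire: Callable[[], bytes],      # S11: obtain biometric information
    send: Callable[[bytes], None],     # S12: send speaker identification information
    interval_s: float = 5.0,           # assumed acquisition period (S10)
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat acquire-and-send while the call lasts; return the number of sends."""
    sent = 0
    while call_active():               # call not ended: keep going
        sample = acquire()             # S11
        send(sample)                   # S12
        sent += 1
        sleep(interval_s)              # wait for the next acquisition timing (S10)
    return sent
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in a real terminal the callables would wrap the sensor and the network client.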

Next, the functional configuration of the speaker identification server will be described. As shown in the functional block diagram of FIG. 4, the speaker identification server 20 has a reception unit 21, a speaker identification unit 22, and a transmission unit 23.

The reception unit 21 receives, from the call terminal 10 that is in a call, speaker identification information, which is biometric information repeatedly acquired during the call or a feature amount extracted from that biometric information.

The speaker identification unit 22 identifies the speaker based on the speaker identification information received by the reception unit 21 and pre-registered reference information. FIG. 5 schematically shows an example of the reference information. In the illustrated reference information, the user's name and speaker identification information are registered in association with each other. Other information, such as user attributes (age, gender, company name, department name, title, etc.), may also be registered.

The speaker identification unit 22 can identify the speaker by matching the speaker identification information received by the reception unit 21 against the speaker identification information included in the reference information.
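The description does not specify how the matching is computed. As one possible illustration, the sketch below assumes that speaker identification information is a numeric feature vector and that matching uses cosine similarity with an acceptance threshold; the function names, the vector representation, and the threshold of 0.8 are all assumptions and do not come from the source.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_speaker(
    received: Sequence[float],
    reference: Dict[str, Sequence[float]],  # name -> registered feature vector (cf. FIG. 5)
    threshold: float = 0.8,                 # assumed acceptance threshold
) -> Optional[str]:
    """Return the best-matching registered name, or None if no score reaches the threshold."""
    best_name, best_score = None, threshold
    for name, vector in reference.items():
        score = cosine_similarity(received, vector)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

Returning `None` when nothing reaches the threshold lets the caller decide how to report an unregistered speaker, a case the description leaves open.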

The transmission unit 23 transmits information indicating the identified speaker to the call partner terminal that is in a call with the call terminal 10 that sent the speaker identification information. The transmitted information may include the speaker's attributes in addition to the speaker's name. The speaker identification server 20 can acquire the communication address (e.g., IP address) of the call partner terminal from, for example, the call terminal 10, and can use that address to transmit the information to the call partner terminal.

Next, an example of the processing flow of the speaker identification server 20 will be described with reference to the flowchart of FIG. 6. The processing of FIG. 6 is performed when speaker identification information is received.

In S20, the speaker identification unit 22 identifies the speaker based on the speaker identification information received by the reception unit 21 and the pre-registered reference information.

In S21, the transmission unit 23 transmits information indicating the identified speaker to the call partner terminal. The call partner terminal that has received the information indicating the speaker from the speaker identification server 20 can output that information via any output means (e.g., display, speaker). Based on the output information, the user of the call partner terminal can identify the speaker at the call terminal 10.
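Steps S20 and S21 can be sketched as a single request handler. The sketch assumes that the call terminal sends the call partner terminal's address together with the speaker identification information, as the description permits; `IdentificationRequest`, `handle_request`, the injected `identify` and `send` callables, and the "unknown speaker" fallback are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IdentificationRequest:
    speaker_identification_info: bytes  # biometric data or extracted features
    partner_address: str                # e.g. IP address of the call partner terminal


def handle_request(
    request: IdentificationRequest,
    identify: Callable[[bytes], Optional[str]],  # S20: match against reference information
    send: Callable[[str, str], None],            # S21: send(address, message)
) -> None:
    """Identify the speaker (S20) and notify the call partner terminal (S21)."""
    speaker = identify(request.speaker_identification_info)
    # Fallback text for an unmatched speaker is an assumption of this sketch.
    message = speaker if speaker is not None else "unknown speaker"
    send(request.partner_address, message)
```

Because the handler runs once per received message, a change of speaker during the call shows up at the call partner terminal as a different name in a later notification.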

The call partner terminal can also have the same functions as the call terminal 10. In that case, the user of the call terminal 10 can identify the speaker at the call partner terminal based on information from the speaker identification server 20.

With the call system of the present embodiment described above, a user can identify the call partner (the person currently speaking on the other end) during a call.

For example, when contract details are changed or confirmed by telephone, it is necessary to confirm that the call partner is the contract holder. With the call system of the present embodiment, identity verification can be performed easily and with high accuracy.

Even when it is difficult to identify the call partner by voice alone, the call system of the present embodiment makes it possible to identify the call partner easily and with high accuracy during a call.

There are fraudulent schemes that exploit the difficulty of identifying a call partner by voice alone. The call system of the present embodiment is expected to help deter such fraud.

Because the call system of the present embodiment identifies the person on the other end of the call, it avoids problems such as mistaking the call partner for someone else and disclosing confidential information to an outsider. As a result, security improves.

Because the call system of the present embodiment identifies the person on the other end of the call, mistakes in transferring calls are reduced and call transfer work is simplified.

In the call system of the present embodiment, the speaker's biometric information is acquired repeatedly during a call, and speaker identification based on that biometric information is performed repeatedly. Therefore, if the speaker at the call terminal 10 changes during a call, for example, the user of the call partner terminal can recognize that the speaker has changed and can identify the new speaker.

A modification will now be described. The call system may perform the above speaker identification processing after the call origination operation and before the call starts. That is, the biometric information acquisition unit 12 of the call terminal 10 may acquire biometric information after the call origination operation and before the call starts. The first transmission unit 13 of the call terminal 10 may then transmit speaker identification information to the speaker identification server 20 after the call origination operation and before the call starts. The reception unit 21 of the speaker identification server 20 may receive the speaker identification information from the call terminal 10 after the call origination operation and before the call starts. The transmission unit 23 of the speaker identification server 20 may then transmit information indicating the identified speaker to the call terminal 10 that sent the speaker identification information, or to the call terminal that the sending call terminal 10 is calling.

With this modification, a user can not only identify the call partner (the person currently speaking on the other end) during a call, but also identify the caller before the call starts.

＜第２の実施形態＞ <Second embodiment>
本実施形態の通話システムは、話者識別サーバ２０により生成された話者を示す情報が通話相手端末に届けられるまでのルートが第１の実施形態と異なる。具体的には、話者識別サーバ２０は、話者識別情報を通話端末１０に返信する。そして、通話端末１０が話者識別サーバ２０から受信した話者を示す情報を通話相手端末に送信する。その他の構成は第１の実施形態と同様である。 In the call system of the present embodiment, the route until the information indicating the speaker generated by the speaker identification server 20 is delivered to the other party terminal is different from that of the first embodiment. Specifically, the speaker identification server 20 returns the speaker identification information to the call terminal 10. Then, the call terminal 10 transmits the information indicating the speaker received from the speaker identification server 20 to the other party terminal. Other configurations are the same as those of the first embodiment.

通話端末１０及び話者識別サーバ２０のハードウエア構成の一例は、第１の実施形態と同様である。 An example of the hardware configuration of the call terminal 10 and the speaker identification server 20 is the same as that of the first embodiment.

まず、通話端末１０の機能構成を説明する。図７の機能ブロック図に示すように、通話端末１０は、通話部１１と、生体情報取得部１２と、第１の送信部１３と、受信部１４と、第２の送信部１５とを有する。通話部１１、生体情報取得部１２及び第１の送信部１３の機能構成は、第１の実施形態と同様である。 First, the functional configuration of the call terminal 10 will be described. As shown in the functional block diagram of FIG. 7, the call terminal 10 has a call unit 11, a biometric information acquisition unit 12, a first transmission unit 13, a reception unit 14, and a second transmission unit 15. The functional configurations of the call unit 11, the biometric information acquisition unit 12, and the first transmission unit 13 are the same as those in the first embodiment.

受信部１４は、話者識別サーバ２０から話者を示す情報を受信する。話者を示す情報は、第１の送信部１３が話者識別サーバ２０に送信した話者識別情報に基づき、話者識別サーバ２０にて識別された話者を示す情報である。 The receiving unit 14 receives information indicating a speaker from the speaker identification server 20. The information indicating the speaker is information indicating the speaker identified by the speaker identification server 20 based on the speaker identification information transmitted by the first transmission unit 13 to the speaker identification server 20.

第２の送信部１５は、受信部１４により受信された話者を示す情報を、通話相手端末に送信する。 The second transmitting unit 15 transmits information indicating the speaker received by the receiving unit 14 to the other party terminal.

次に、図８のフローチャートを用いて、通話端末１０の処理の流れの一例を説明する。図８の処理は、他の通話端末との間の通話の確立に応じて行われる。 Next, an example of the processing flow of the call terminal 10 will be described with reference to the flowchart of FIG. 8. The process of FIG. 8 is performed in response to the establishment of a call with another call terminal.

通話開始後、生体情報取得タイミングになると（Ｓ３０のＹｅｓ）、生体情報取得部１２は話者の生体情報を取得する（Ｓ３１）。そして、第１の送信部１３は、話者識別情報を話者識別サーバ２０に送信する（Ｓ３２）。話者識別情報は、Ｓ１１で取得された生体情報、又は、当該生体情報から抽出された特徴量である。 After the start of the call, when the biometric information acquisition timing comes (Yes in S30), the biometric information acquisition unit 12 acquires the biometric information of the speaker (S31). Then, the first transmission unit 13 transmits the speaker identification information to the speaker identification server 20 (S32). The speaker identification information is the biological information acquired in S11 or the feature amount extracted from the biological information.

その後、Ｓ３２での話者識別情報の送信に応じて話者識別サーバ２０から返信されてきた話者を示す情報を、受信部１４が受信する（Ｓ３３）。そして、第２の送信部１５は、Ｓ３３で受信された話者を示す情報を通話相手端末に送信する。通話端末１０から話者を示す情報を受信した通話相手端末は、その情報を任意の出力手段（例：ディスプレイ、スピーカ等）を介して出力することができる。出力された情報に基づき、通話相手端末のユーザは、通話端末１０の話者を識別できる。 After that, the receiving unit 14 receives the information indicating the speaker returned from the speaker identification server 20 in response to the transmission of the speaker identification information in S32 (S33). Then, the second transmission unit 15 transmits the information indicating the speaker received in S33 to the other party terminal. The call partner terminal that has received the information indicating the speaker from the call terminal 10 can output the information via any output means (eg, display, speaker, etc.). Based on the output information, the user of the other party's terminal can identify the speaker of the call terminal 10.

通話が終了していない間（Ｓ３５のＮｏ）、生体情報取得タイミング待ちとなる（Ｓ３０）。そして、生体情報取得タイミングになると（Ｓ３０のＹｅｓ）、同様の処理を行う。また、通話が終了すると（Ｓ３５のＹｅｓ）、当該処理を終了する。 While the call is not completed (No in S35), it waits for the biometric information acquisition timing (S30). Then, when the biometric information acquisition timing comes (Yes in S30), the same processing is performed. When the call ends (Yes in S35), the process ends.
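
The flow of FIG. 8 can be pictured as a simple loop. The following Python sketch is only an illustration under stated assumptions: the function name `run_identification_loop` and its five callback parameters are hypothetical and do not appear in the specification, and the network exchange with the speaker identification server 20 (S32 and S33) is collapsed into a single `identify` callback.

```python
from typing import Callable


def run_identification_loop(
    call_active: Callable[[], bool],      # False once the call has ended (S35)
    acquisition_due: Callable[[], bool],  # True at each acquisition timing (S30)
    acquire: Callable[[], bytes],         # S31: acquire biometric information
    identify: Callable[[bytes], str],     # S32 + S33: send to server, receive speaker
    forward: Callable[[str], None],       # send the speaker to the other party's terminal
) -> None:
    """Repeat acquisition, identification and forwarding until the call ends.

    All names here are hypothetical; a real terminal would block or sleep
    while waiting instead of polling in a tight loop.
    """
    while call_active():
        if not acquisition_due():
            continue  # keep waiting for the next acquisition timing
        sample = acquire()
        speaker = identify(sample)
        forward(speaker)
```

Passing the steps in as callbacks keeps the sketch independent of any particular sensor or transport, which the specification leaves open.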

次に、話者識別サーバの機能構成を説明する。図４の機能ブロック図に示すように、話者識別サーバ２０は、受信部２１と、話者識別部２２と、送信部２３とを有する。受信部２１及び話者識別部２２の機能構成は、第１の実施形態と同様である。 Next, the functional configuration of the speaker identification server will be described. As shown in the functional block diagram of FIG. 4, the speaker identification server 20 has a receiving unit 21, a speaker identification unit 22, and a transmitting unit 23. The functional configuration of the receiving unit 21 and the speaker identification unit 22 is the same as that of the first embodiment.

送信部２３は、話者識別部２２により識別された話者を示す情報を、話者識別情報の送信元の通話端末１０に送信する。 The transmission unit 23 transmits information indicating the speaker identified by the speaker identification unit 22 to the call terminal 10 from which the speaker identification information is transmitted.

次に、図９のフローチャートを用いて、話者識別サーバ２０の処理の流れの一例を説明する。図９の処理は、話者識別情報の受信に応じて行われる。 Next, an example of the processing flow of the speaker identification server 20 will be described with reference to the flowchart of FIG. 9. The process of FIG. 9 is performed in response to the reception of the speaker identification information.

Ｓ４０では、話者識別部２２は、受信部２１により受信された話者識別情報と、予め登録されている参照情報とに基づき、話者を識別する。 In S40, the speaker identification unit 22 identifies the speaker based on the speaker identification information received by the reception unit 21 and the reference information registered in advance.

Ｓ４１では、送信部２３は、識別した話者を示す情報を、話者識別情報の送信元である通話端末１０に送信する。 In S41, the transmission unit 23 transmits the information indicating the identified speaker to the call terminal 10 which is the transmission source of the speaker identification information.
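
Steps S40 and S41 might look like the following Python sketch. The specification does not say how the received speaker identification information is compared with the reference information, so the cosine-similarity matching, the `threshold` value, and all function and parameter names below are assumptions made purely for illustration.

```python
from math import sqrt
from typing import Dict, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify_speaker(
    features: Sequence[float],
    references: Dict[str, Sequence[float]],  # pre-registered name -> features
    threshold: float = 0.8,                  # hypothetical acceptance threshold
) -> Optional[str]:
    """S40: return the best-matching registered name, or None if none is close enough."""
    best_name, best_score = None, threshold
    for name, reference in references.items():
        score = cosine(features, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def handle_request(
    sender_id: str,
    features: Sequence[float],
    references: Dict[str, Sequence[float]],
) -> Tuple[str, dict]:
    """S41: address the result to the terminal that sent the request."""
    speaker = identify_speaker(features, references)
    return sender_id, {"speaker": speaker if speaker is not None else "unknown"}
```

Returning the destination together with the payload makes explicit that, in this embodiment, the result goes back to the sender rather than to the other party's terminal.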

以上説明した本実施形態の通話システムによれば、第１の実施形態と同様な作用効果が実現される。 According to the call system of the present embodiment described above, the same operations and effects as those of the first embodiment are realized.

＜第３の実施形態＞ <Third embodiment>
本実施形態の通話システムでは、通話端末１０は複数種類の生体情報を取得する手段を有し、周囲環境に基づき話者識別サーバ２０に送信する生体情報の種類を変更できる点で、第１及び第２の実施形態と異なる。その他の構成は第１及び第２の実施形態と同様である。 The call system of the present embodiment differs from the first and second embodiments in that the call terminal 10 has means for acquiring a plurality of types of biometric information and can change, based on the surrounding environment, the type of biometric information transmitted to the speaker identification server 20. Other configurations are the same as those of the first and second embodiments.

通話端末１０及び話者識別サーバ２０のハードウエア構成の一例は、第１及び第２の実施形態と同様である。 An example of the hardware configuration of the call terminal 10 and the speaker identification server 20 is the same as in the first and second embodiments.

まず、通話端末１０の機能構成を説明する。図１０の機能ブロック図に示すように、通話端末１０は、通話部１１と、生体情報取得部１２と、第１の送信部１３と、選択部１６とを有する。通話端末１０は、さらに受信部１４と、第２の送信部１５とを有してもよい。通話部１１、受信部１４及び第２の送信部１５の機能構成は、第１及び第２の実施形態と同様である。 First, the functional configuration of the call terminal 10 will be described. As shown in the functional block diagram of FIG. 10, the call terminal 10 includes a call unit 11, a biometric information acquisition unit 12, a first transmission unit 13, and a selection unit 16. The call terminal 10 may further have a reception unit 14 and a second transmission unit 15. The functional configurations of the call unit 11, the reception unit 14, and the second transmission unit 15 are the same as those of the first and second embodiments.

生体情報取得部１２は、体の一部（例：顔）の画像、体（例：耳の中）で反射した反射音、指紋又は声の中の少なくとも２つを取得する手段を有する。生体情報取得部１２のその他の機能構成は、第１及び第２の実施形態と同様である。 The biological information acquisition unit 12 has a means for acquiring at least two of an image of a part of the body (eg, face), a reflected sound reflected by the body (eg, in the ear), a fingerprint, or a voice. Other functional configurations of the biological information acquisition unit 12 are the same as those of the first and second embodiments.

選択部１６は、通話端末１０の周囲環境に基づき、話者識別サーバ２０に送信する生体情報の種類を選択する。 The selection unit 16 selects the type of biometric information to be transmitted to the speaker identification server 20 based on the surrounding environment of the call terminal 10.

例えば、選択部１６は、通話端末１０の周囲が所定レベルより暗い場合、体で反射した反射音又は声を選択してもよい。この場合、通話端末１０は照度センサを備えてもよい。そして、選択部１６は照度センサの検出結果に基づき、通話端末１０の周囲が所定レベルより暗いか否かを判断できる。 For example, the selection unit 16 may select the reflected sound reflected by the body, or the voice, when the surroundings of the call terminal 10 are darker than a predetermined level. In this case, the call terminal 10 may include an illuminance sensor. The selection unit 16 can then determine, based on the detection result of the illuminance sensor, whether or not the surroundings of the call terminal 10 are darker than the predetermined level.

また、選択部１６は、通話端末１０の周囲の騒音レベルが閾値以上である場合、体の一部の画像または指紋を選択してもよい。選択部１６は、通話端末１０のマイクを介して入力された音に基づき、通話端末１０の周囲の騒音レベルが閾値以上か否かを判断できる。 Further, the selection unit 16 may select an image of a part of the body, or a fingerprint, when the noise level around the call terminal 10 is equal to or higher than a threshold value. The selection unit 16 can determine whether or not the noise level around the call terminal 10 is equal to or higher than the threshold value based on the sound input through the microphone of the call terminal 10.
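
The two selection rules above can be sketched in Python as follows. The enum, the function name, and the numeric defaults for the "predetermined level" and the noise threshold are hypothetical, since the specification gives no concrete values. It also does not say what to do when the surroundings are both dark and noisy; this sketch simply returns `None` in that case, which is an assumption.

```python
from enum import Enum
from typing import Optional, Sequence


class BiometricType(Enum):
    BODY_IMAGE = "body_image"
    REFLECTED_SOUND = "reflected_sound"
    FINGERPRINT = "fingerprint"
    VOICE = "voice"


# Types the text names as selectable in each condition.
USABLE_WHEN_DARK = {BiometricType.REFLECTED_SOUND, BiometricType.VOICE}
USABLE_WHEN_NOISY = {BiometricType.BODY_IMAGE, BiometricType.FINGERPRINT}


def select_biometric_type(
    available: Sequence[BiometricType],
    illuminance_lux: float,
    noise_db: float,
    dark_below_lux: float = 10.0,  # hypothetical "predetermined level"
    noisy_at_db: float = 70.0,     # hypothetical noise threshold
) -> Optional[BiometricType]:
    """Pick the first available type that suits the current surroundings."""
    candidates = list(available)
    if illuminance_lux < dark_below_lux:
        candidates = [t for t in candidates if t in USABLE_WHEN_DARK]
    if noise_db >= noisy_at_db:
        candidates = [t for t in candidates if t in USABLE_WHEN_NOISY]
    # Dark and noisy at once leaves no candidate; the source does not cover this.
    return candidates[0] if candidates else None
```

The order of `available` acts as the terminal's preference when the environment imposes no restriction.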

第１の送信部１３は、選択部１６により選択された種類の生体情報、又は、選択部１６により選択された種類の生体情報から抽出された特徴量である話者識別情報を話者識別サーバ２０に送信する。 The first transmission unit 13 transmits, to the speaker identification server 20, speaker identification information that is either the biometric information of the type selected by the selection unit 16 or a feature amount extracted from the biometric information of that type.

話者識別サーバ２０は、ユーザの氏名と、複数種類の話者識別情報とを対応付けた参照情報を記憶し、通話端末１０から受信した種類の話者識別情報を用いて話者識別を行う。話者識別サーバ２０のその他の機能構成は、第１及び第２の実施形態と同様である。 The speaker identification server 20 stores reference information in which a user's name is associated with a plurality of types of speaker identification information, and identifies the speaker using the speaker identification information of the type received from the call terminal 10. Other functional configurations of the speaker identification server 20 are the same as those of the first and second embodiments.
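
One way to picture reference information that holds several types per user is a nested mapping from name to type to features. The layout, the type labels, the Euclidean-distance matching and the `max_distance` cutoff in this Python sketch are all assumptions for illustration; the specification only states that the name and the multiple types are associated with each other.

```python
from math import dist
from typing import Dict, List, Optional

# Hypothetical layout: user name -> biometric type -> registered feature vector.
ReferenceInfo = Dict[str, Dict[str, List[float]]]


def identify_by_type(
    reference_info: ReferenceInfo,
    biometric_type: str,
    features: List[float],
    max_distance: float = 0.5,  # hypothetical acceptance cutoff
) -> Optional[str]:
    """Compare only against references of the same type as the received data."""
    best_name = None
    best_distance = max_distance
    for name, by_type in reference_info.items():
        reference = by_type.get(biometric_type)
        if reference is None:
            continue  # this user has no reference of the received type
        distance = dist(features, reference)
        if distance <= best_distance:
            best_name, best_distance = name, distance
    return best_name
```

Users who never registered the received type are skipped, so a terminal that switches types mid-call can only be matched against users enrolled for the new type.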

以上説明した本実施形態の通話システムによれば、第１及び第２の実施形態と同様な作用効果が実現される。また、本実施形態の通話システムによれば、通話端末１０の周囲環境に基づき、話者認証に用いる生体情報を選択することができる。例えば、周囲が暗い場合は、周囲の明るさに影響されない情報（体で反射した反射音や声）を話者認証に用いる生体情報として選択することができる。また、周囲がうるさい場合は、周囲のうるささに影響されない情報（体の一部の画像や指紋）を話者認証に用いる生体情報として選択することができる。かかる場合、周囲環境に影響されない高精度な話者認証が実現できる。 According to the call system of the present embodiment described above, the same operations and effects as those of the first and second embodiments are realized. Further, according to the call system of the present embodiment, the biometric information used for speaker authentication can be selected based on the surrounding environment of the call terminal 10. For example, when the surroundings are dark, information that is not affected by the ambient brightness (reflected sound reflected by the body, or voice) can be selected as the biometric information used for speaker authentication. When the surroundings are noisy, information that is not affected by the ambient noise (an image of a part of the body, or a fingerprint) can be selected as the biometric information used for speaker authentication. In such cases, highly accurate speaker authentication that is not affected by the surrounding environment can be realized.

＜第４の実施形態＞ <Fourth embodiment>
本実施形態の通話システムでは、内線通話中に話者識別を行い、外線通話中に話者識別を行わない点で、第１乃至第３の実施形態と異なる。その他の構成は第１乃至第３の実施形態と同様である。 The call system of the present embodiment is different from the first to third embodiments in that the speaker is identified during the extension call and the speaker is not identified during the outside line call. Other configurations are the same as those of the first to third embodiments.

通話端末１０及び話者識別サーバ２０のハードウエア構成の一例は、第１乃至第３の実施形態と同様である。 An example of the hardware configuration of the call terminal 10 and the speaker identification server 20 is the same as that of the first to third embodiments.

まず、通話端末１０の機能構成を説明する。通話端末１０の機能ブロック図の一例は、図２で示される。図示するように、通話端末１０は、通話部１１と、生体情報取得部１２と、第１の送信部１３とを有する。なお、通話端末１０は、さらに、受信部１４及び第２の送信部１５を有してもよい。また、通話端末１０は、受信部１４及び第２の送信部１５に加えて、又は代えて、選択部１６を有してもよい。通話部１１、受信部１４、第２の送信部１５及び選択部１６の機能構成は、第１乃至第３の実施形態と同様である。 First, the functional configuration of the call terminal 10 will be described. An example of the functional block diagram of the call terminal 10 is shown in FIG. 2. As illustrated, the call terminal 10 has a call unit 11, a biometric information acquisition unit 12, and a first transmission unit 13. The call terminal 10 may further include a reception unit 14 and a second transmission unit 15. Further, the call terminal 10 may have a selection unit 16 in addition to or in place of the reception unit 14 and the second transmission unit 15. The functional configurations of the call unit 11, the reception unit 14, the second transmission unit 15, and the selection unit 16 are the same as those in the first to third embodiments.

生体情報取得部１２は、内線通話中に生体情報を繰り返し取得する。そして、第１の送信部１３は、内線通話中に話者識別情報を話者識別サーバ２０に送信する。なお、生体情報取得部１２は、外線通話中に生体情報を取得しない。そして、第１の送信部１３は、外線通話中に話者識別情報を話者識別サーバ２０に送信しない。生体情報取得部１２及び第１の送信部１３のその他の機能構成は、第１乃至第３の実施形態と同様である。 The biometric information acquisition unit 12 repeatedly acquires biometric information during an extension call. Then, the first transmission unit 13 transmits the speaker identification information to the speaker identification server 20 during the extension call. The biometric information acquisition unit 12 does not acquire biometric information during an outside line call. Then, the first transmission unit 13 does not transmit the speaker identification information to the speaker identification server 20 during the outside line call. Other functional configurations of the biological information acquisition unit 12 and the first transmission unit 13 are the same as those of the first to third embodiments.

話者識別サーバ２０の機能構成は、第１乃至第３の実施形態と同様である。 The functional configuration of the speaker identification server 20 is the same as that of the first to third embodiments.

以上説明した本実施形態の通話システムによれば、第１乃至第３の実施形態と同様の作用効果を実現できる。また、本実施形態の通話システムによれば、閉じられた所定のエリア内（例：社内等）でのみ、話者識別機能を利用することができる。すなわち、利用エリアを制限することができる。 According to the call system of the present embodiment described above, the same operations and effects as those of the first to third embodiments can be realized. Further, according to the call system of the present embodiment, the speaker identification function can be used only within a predetermined closed area (e.g., within a company). That is, the area of use can be restricted.

＜第５の実施形態＞ <Fifth embodiment>
本実施形態の通話システムでは、通話端末１０は所定の機能を有する通話端末と通話中に話者識別のための処理（生体情報の取得、話者識別情報の送信等）を行い、所定の機能を有さない通話端末と通話中に話者識別のための処理を行わない点で、第１乃至第４の実施形態と異なる。その他の構成は第１乃至第４の実施形態と同様である。 The call system of the present embodiment differs from the first to fourth embodiments in that the call terminal 10 performs the processing for speaker identification (acquisition of biometric information, transmission of speaker identification information, etc.) during a call with a call terminal that has a predetermined function, and does not perform that processing during a call with a call terminal that does not have the predetermined function. Other configurations are the same as those of the first to fourth embodiments.

通話端末１０及び話者識別サーバ２０のハードウエア構成の一例は、第１乃至第４の実施形態と同様である。 An example of the hardware configuration of the call terminal 10 and the speaker identification server 20 is the same as that of the first to fourth embodiments.

まず、通話端末１０の機能構成を説明する。通話端末１０の機能ブロック図の一例は、図２で示される。図示するように、通話端末１０は、通話部１１と、生体情報取得部１２と、第１の送信部１３とを有する。なお、通話端末１０は、さらに、受信部１４及び第２の送信部１５を有してもよい。また、通話端末１０は、受信部１４及び第２の送信部１５に加えて、又は代えて、選択部１６を有してもよい。通話部１１、受信部１４、第２の送信部１５及び選択部１６の機能構成は、第１乃至第４の実施形態と同様である。 First, the functional configuration of the call terminal 10 will be described. An example of the functional block diagram of the call terminal 10 is shown in FIG. 2. As illustrated, the call terminal 10 has a call unit 11, a biometric information acquisition unit 12, and a first transmission unit 13. The call terminal 10 may further include a reception unit 14 and a second transmission unit 15. Further, the call terminal 10 may have a selection unit 16 in addition to or in place of the reception unit 14 and the second transmission unit 15. The functional configurations of the call unit 11, the reception unit 14, the second transmission unit 15, and the selection unit 16 are the same as those in the first to fourth embodiments.

生体情報取得部１２は、所定の機能を有する通話端末と通話中に、生体情報を繰り返し取得する。そして、第１の送信部１３は、所定の機能を有する通話端末と通話中に、話者識別情報を話者識別サーバ２０に送信する。なお、生体情報取得部１２は、所定の機能を有さない通話端末と通話中に、生体情報を取得しない。そして、第１の送信部１３は、所定の機能を有さない通話端末と通話中に、話者識別情報を話者識別サーバ２０に送信しない。なお、通話端末１０は、発呼操作後かつ通話確立前、又は、通話確立後に、通話相手端末から所定の機能を有するか否かを示す情報を受信してもよい。そして、生体情報取得部１２は、当該情報に基づき、通話相手端末が所定の機能を有するか否かを判断してもよい。生体情報取得部１２及び第１の送信部１３のその他の機能構成は、第１乃至第４の実施形態と同様である。 The biometric information acquisition unit 12 repeatedly acquires biometric information during a call with a call terminal having a predetermined function. Then, the first transmission unit 13 transmits the speaker identification information to the speaker identification server 20 during a call with a call terminal having a predetermined function. The biometric information acquisition unit 12 does not acquire biometric information during a call with a call terminal that does not have a predetermined function. Then, the first transmission unit 13 does not transmit the speaker identification information to the speaker identification server 20 during a call with a call terminal that does not have a predetermined function. The call terminal 10 may receive information indicating whether or not it has a predetermined function from the other party's terminal after the call is made and before the call is established or after the call is established. Then, the biometric information acquisition unit 12 may determine whether or not the other party's terminal has a predetermined function based on the information. Other functional configurations of the biological information acquisition unit 12 and the first transmission unit 13 are the same as those of the first to fourth embodiments.

所定の機能は、話者識別機能である。話者識別機能は、生体情報取得部１２及び第１の送信部１３により実現される機能である。なお、話者識別機能は、生体情報取得部１２、第１の送信部１３、受信部１４及び第２の送信部１５により実現される機能であってもよい。 The predetermined function is a speaker identification function. The speaker identification function is a function realized by the biometric information acquisition unit 12 and the first transmission unit 13. The speaker identification function may instead be a function realized by the biometric information acquisition unit 12, the first transmission unit 13, the reception unit 14, and the second transmission unit 15.
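
The capability check of this embodiment could be sketched in Python as below. The `CapabilityNotice` message and its fields are hypothetical, since the specification does not define the format of the information received from the other party's terminal. Treating a missing notice as "does not have the function" is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CapabilityNotice:
    """Hypothetical message announcing whether a terminal supports speaker identification."""
    terminal_id: str
    supports_speaker_identification: bool


def should_run_identification(
    local_supports: bool,
    peer_notice: Optional[CapabilityNotice],
) -> bool:
    """Run identification only when both ends of the call have the function."""
    if peer_notice is None:
        return False  # assumption: no notice means the peer lacks the function
    return local_supports and peer_notice.supports_speaker_identification
```

The terminal would evaluate this once the notice arrives, either before or after the call is established, and skip biometric acquisition entirely when it returns `False`.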

話者識別サーバ２０の機能構成は、第１乃至第４の実施形態と同様である。 The functional configuration of the speaker identification server 20 is the same as that of the first to fourth embodiments.

以上説明した本実施形態の通話システムによれば、第１乃至第４の実施形態と同様の作用効果を実現できる。また、本実施形態の通話システムによれば、通話中の２つの通話端末が互いに話者識別機能を有する場合に話者識別を行い、通話中の２つの通話端末の少なくとも一方が話者識別機能を有さない場合に話者識別を行わないようにできる。かかる場合、一方のみが自身を示す情報を相手方に伝える不公平を回避できる。 According to the call system of the present embodiment described above, the same operations and effects as those of the first to fourth embodiments can be realized. Further, according to the call system of the present embodiment, speaker identification can be performed when both of the two call terminals in a call have the speaker identification function, and not performed when at least one of the two call terminals lacks it. This avoids the unfairness of only one party conveying information identifying himself or herself to the other party.

＜第６の実施形態＞ <Sixth embodiment>
本実施形態の通話システムでは、通話端末１０は所定の電話番号の通話端末と通話中に話者識別のための処理（生体情報の取得、話者識別情報の送信等）を行い、所定の電話番号でない通話端末と通話中に話者識別のための処理を行わない点で、第１乃至第５の実施形態と異なる。その他の構成は第１乃至第５の実施形態と同様である。 The call system of the present embodiment differs from the first to fifth embodiments in that the call terminal 10 performs the processing for speaker identification (acquisition of biometric information, transmission of speaker identification information, etc.) during a call with a call terminal having a predetermined telephone number, and does not perform that processing during a call with a call terminal whose telephone number is not a predetermined telephone number. Other configurations are the same as those of the first to fifth embodiments.

通話端末１０及び話者識別サーバ２０のハードウエア構成の一例は、第１乃至第５の実施形態と同様である。 An example of the hardware configuration of the call terminal 10 and the speaker identification server 20 is the same as that of the first to fifth embodiments.

まず、通話端末１０の機能構成を説明する。通話端末１０の機能ブロック図の一例は、図２で示される。図示するように、通話端末１０は、通話部１１と、生体情報取得部１２と、第１の送信部１３とを有する。なお、通話端末１０は、さらに、受信部１４及び第２の送信部１５を有してもよい。また、通話端末１０は、受信部１４及び第２の送信部１５に加えて、又は代えて、選択部１６を有してもよい。通話部１１、受信部１４、第２の送信部１５及び選択部１６の機能構成は、第１乃至第５の実施形態と同様である。 First, the functional configuration of the call terminal 10 will be described. An example of the functional block diagram of the call terminal 10 is shown in FIG. 2. As illustrated, the call terminal 10 has a call unit 11, a biometric information acquisition unit 12, and a first transmission unit 13. The call terminal 10 may further include a reception unit 14 and a second transmission unit 15. Further, the call terminal 10 may have a selection unit 16 in addition to or in place of the reception unit 14 and the second transmission unit 15. The functional configurations of the call unit 11, the reception unit 14, the second transmission unit 15, and the selection unit 16 are the same as those in the first to fifth embodiments.

生体情報取得部１２は、所定の電話番号の通話端末と通話中に、生体情報を繰り返し取得する。そして、第１の送信部１３は、所定の電話番号の通話端末と通話中に、話者識別情報を話者識別サーバ２０に送信する。なお、生体情報取得部１２は、所定の電話番号でない通話端末と通話中に、生体情報を取得しない。そして、第１の送信部１３は、所定の電話番号でない通話端末と通話中に、話者識別情報を話者識別サーバ２０に送信しない。生体情報取得部１２及び第１の送信部１３のその他の機能構成は、第１乃至第５の実施形態と同様である。 The biometric information acquisition unit 12 repeatedly acquires biometric information during a call with a call terminal having a predetermined telephone number. Then, the first transmission unit 13 transmits the speaker identification information to the speaker identification server 20 during a call with the call terminal having a predetermined telephone number. The biometric information acquisition unit 12 does not acquire biometric information during a call with a call terminal that does not have a predetermined telephone number. Then, the first transmission unit 13 does not transmit the speaker identification information to the speaker identification server 20 during a call with a call terminal having a non-predetermined telephone number. Other functional configurations of the biological information acquisition unit 12 and the first transmission unit 13 are the same as those of the first to fifth embodiments.

例えば、ユーザは、予め自身の氏名等を通知してもよい電話番号のリスト（ホワイトリスト）を作成し、通話端末１０に登録しておいてもよい。この場合、ホワイトリストに登録された電話番号が所定の電話番号となる。その他、ユーザは、予め自身の氏名等を通知することを拒否する電話番号のリスト（ブラックリスト）を作成し、通話端末１０に登録しておいてもよい。この場合、ブラックリストに登録されていない電話番号が所定の電話番号となる。 For example, the user may create in advance a list (whitelist) of telephone numbers to which his or her name and the like may be notified, and register it in the call terminal 10. In this case, the telephone numbers registered in the whitelist are the predetermined telephone numbers. Alternatively, the user may create in advance a list (blacklist) of telephone numbers to which he or she refuses to have his or her name and the like notified, and register it in the call terminal 10. In this case, the telephone numbers not registered in the blacklist are the predetermined telephone numbers.
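
The whitelist and blacklist rules can be expressed as a single membership check. In this Python sketch the function name and the keyword parameters are hypothetical. The text presents the two lists as alternatives, so the sketch requires exactly one of them, which is an interpretation rather than something the specification states.

```python
from typing import Optional, Set


def is_predetermined_number(
    peer_number: str,
    *,
    whitelist: Optional[Set[str]] = None,
    blacklist: Optional[Set[str]] = None,
) -> bool:
    """Return True if speaker identification should run for this peer number."""
    if (whitelist is None) == (blacklist is None):
        raise ValueError("provide exactly one of whitelist or blacklist")
    if whitelist is not None:
        return peer_number in whitelist   # only listed numbers qualify
    return peer_number not in blacklist   # every number except listed ones qualifies
```

A whitelist defaults to withholding the user's name from unknown numbers, while a blacklist defaults to disclosing it, so the choice between them sets how cautious the terminal is with callers it has never seen.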

話者識別サーバ２０の機能構成は、第１乃至第５の実施形態と同様である。 The functional configuration of the speaker identification server 20 is the same as that of the first to fifth embodiments.

以上説明した本実施形態の通話システムによれば、第１乃至第５の実施形態と同様の作用効果を実現できる。また、本実施形態の通話システムによれば、ユーザは、自身の氏名等を伝える相手と、伝えない相手を決定することができる。このため、当該機能を利用して個人情報が不当に取得される不都合を回避できる。 According to the call system of the present embodiment described above, the same operations and effects as those of the first to fifth embodiments can be realized. Further, according to the call system of the present embodiment, the user can decide to whom his or her name and the like are conveyed and to whom they are not. This avoids the problem of personal information being improperly obtained through this function.

＜第７の実施形態＞ <Seventh embodiment>
本実施形態の通話システムは、話者識別サーバ２０が識別結果を通話履歴として登録する機能を有する点で、第１乃至第６の実施形態と異なる。その他の構成は第１乃至第６の実施形態と同様である。 The call system of the present embodiment is different from the first to sixth embodiments in that the speaker identification server 20 has a function of registering the identification result as a call history. Other configurations are the same as those of the first to sixth embodiments.

通話端末１０及び話者識別サーバ２０のハードウエア構成の一例は、第１乃至第６の実施形態と同様である。 An example of the hardware configuration of the call terminal 10 and the speaker identification server 20 is the same as that of the first to sixth embodiments.

通話端末１０の機能構成は、第１乃至第６の実施形態と同様である。 The functional configuration of the call terminal 10 is the same as that of the first to sixth embodiments.

次に、話者識別サーバ２０の機能構成を説明する。図１１の機能ブロック図に示すように、話者識別サーバ２０は、受信部２１と、話者識別部２２と、送信部２３と、通話履歴登録部２４とを有する。受信部２１、話者識別部２２及び送信部２３の機能構成は、第１乃至第６の実施形態と同様である。 Next, the functional configuration of the speaker identification server 20 will be described. As shown in the functional block diagram of FIG. 11, the speaker identification server 20 includes a reception unit 21, a speaker identification unit 22, a transmission unit 23, and a call history registration unit 24. The functional configurations of the receiving unit 21, the speaker identification unit 22, and the transmitting unit 23 are the same as those in the first to sixth embodiments.

通話履歴登録部２４は、話者識別部２２による識別結果を通話履歴として登録する。図１２に、通話履歴登録部２４により登録される情報の一例を模式的に示す。図示する情報は、通話開始日時、通話終了日時、発呼側通話端末（第１の通話端末）のＩＤ（identifier）、発呼側通話端末の話者、着呼側通話端末（第２の通話端末）のＩＤ、及び、着呼側通話端末の話者を互いに対応付けている。発呼側通話端末の話者は、発呼側通話端末から話者識別サーバ２０に送信された話者識別情報に基づき識別された話者である。着呼側通話端末の話者は、着呼側通話端末から話者識別サーバ２０に送信された話者識別情報に基づき識別された話者である。 The call history registration unit 24 registers the identification result of the speaker identification unit 22 as a call history. FIG. 12 schematically shows an example of the information registered by the call history registration unit 24. The illustrated information associates with one another the call start date and time, the call end date and time, the ID (identifier) of the calling-side call terminal (first call terminal), the speaker of the calling-side call terminal, the ID of the called-side call terminal (second call terminal), and the speaker of the called-side call terminal. The speaker of the calling-side call terminal is the speaker identified based on the speaker identification information transmitted from the calling-side call terminal to the speaker identification server 20. The speaker of the called-side call terminal is the speaker identified based on the speaker identification information transmitted from the called-side call terminal to the speaker identification server 20.

図１２に示すように、通話履歴登録部２４は、通話中の２つの通話端末１０各々に対応して識別された話者を、互いに対応付けて登録してもよい。当該情報によれば、だれとだれが電話する関係かを示す人間マップが得られる。 As shown in FIG. 12, the call history registration unit 24 may register the speakers identified for each of the two call terminals 10 in a call in association with each other. This information yields a map of human relationships showing who calls whom.

また、図１２に示すように、通話履歴登録部２４は、ある通話端末１０に対応して識別された話者を、その通話端末１０に対応付けて登録してもよい。当該情報によれば、ある通話端末１０を利用する者を特定できる。また、ある通話端末１０をよく利用する者や、たまに利用する者、ほとんど利用しない者、全く利用しない者等を特定できる。 Further, as shown in FIG. 12, the call history registration unit 24 may register the speaker identified corresponding to a certain call terminal 10 in association with the call terminal 10. According to the information, a person who uses a certain telephone terminal 10 can be identified. Further, it is possible to identify a person who frequently uses a certain call terminal 10, a person who uses it occasionally, a person who rarely uses it, a person who does not use it at all, and the like.

なお、「ある通話端末１０に対応して識別された話者」は、その通話端末１０から話者識別サーバ２０に送信された話者識別情報に基づき識別された話者である。 The "speaker identified corresponding to a certain call terminal 10" is a speaker identified based on the speaker identification information transmitted from the call terminal 10 to the speaker identification server 20.

また、図１２に示すように、通話履歴登録部２４は、ある通話端末１０に対応して識別された複数の話者を、互いに対応付けて登録してもよい。例えば、通話履歴登録部２４は、１つの通話中に、ある通話端末１０に対応して識別された複数の話者を、互いに対応付けて登録してもよい。当該情報によれば、だれとだれが通話を途中で変わる関係かを示す人間マップが得られる。 Further, as shown in FIG. 12, the call history registration unit 24 may register a plurality of speakers identified for a certain call terminal 10 in association with each other. For example, the call history registration unit 24 may register, in association with each other, a plurality of speakers identified for a certain call terminal 10 during one call. This information yields a map of human relationships showing who hands a call over to whom partway through.
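
The registered history and the three kinds of association described above might be modeled as in the following Python sketch. The `CallRecord` fields follow the items listed for FIG. 12, but keeping each side's speakers as an ordered list, and the three helper functions, are assumptions added for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass
class CallRecord:
    start: datetime
    end: datetime
    caller_terminal_id: str
    caller_speakers: List[str]  # assumption: speakers in the order identified
    callee_terminal_id: str
    callee_speakers: List[str]


def who_calls_whom(history: List[CallRecord]) -> Counter:
    """Count calls per (caller speaker, callee speaker) pair."""
    pairs: Counter = Counter()
    for rec in history:
        for a in set(rec.caller_speakers):
            for b in set(rec.callee_speakers):
                pairs[(a, b)] += 1
    return pairs


def terminal_usage(history: List[CallRecord]) -> Counter:
    """Count calls per (terminal ID, speaker) pair."""
    usage: Counter = Counter()
    for rec in history:
        for s in set(rec.caller_speakers):
            usage[(rec.caller_terminal_id, s)] += 1
        for s in set(rec.callee_speakers):
            usage[(rec.callee_terminal_id, s)] += 1
    return usage


def handovers(history: List[CallRecord]) -> List[Tuple[str, str]]:
    """List (previous speaker, next speaker) changes within a single call."""
    result = []
    for rec in history:
        for speakers in (rec.caller_speakers, rec.callee_speakers):
            for prev, nxt in zip(speakers, speakers[1:]):
                if prev != nxt:
                    result.append((prev, nxt))
    return result
```

Because identification repeats during a call, one record can hold several speakers for the same terminal, which is what makes the handover view possible.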

また、通話履歴登録部２４により登録される情報は、ある人がある時刻に電話していたことを証明する資料としても利用することができる。例えば、電話会議などの場合には、当該会議に参加していた者を確認したり、証明したりできる。 In addition, the information registered by the call history registration unit 24 can also be used as a material for proving that a certain person was calling at a certain time. For example, in the case of a telephone conference, it is possible to confirm or prove the person who participated in the conference.

以上説明した本実施形態の通話システムによれば、第１乃至第６の実施形態と同様の作用効果を実現できる。また、本実施形態の通話システムによれば、話者識別の結果を通話履歴として登録できる。登録された情報によれば、人間関係や、人と通話端末１０との関係等を把握することができる。また、登録された情報を、所定の事実の証明として利用することもできる。 According to the call system of the present embodiment described above, the same operations and effects as those of the first to sixth embodiments can be realized. Further, according to the call system of the present embodiment, the result of speaker identification can be registered as a call history. The registered information makes it possible to grasp relationships between people, relationships between people and the call terminal 10, and the like. The registered information can also be used as proof of a given fact.

以下、参考形態の例を付記する。
１．他の通話端末と通話する通話手段と、
通話中、生体情報を繰り返し取得する生体情報取得手段と、
通話中、前記生体情報、又は、前記生体情報から抽出された特徴量である話者識別情報を、前記話者識別情報に基づき話者を識別する話者識別サーバに繰り返し送信する第１の送信手段と、
を有する通話端末。
２．１に記載の通話端末において、
前記話者識別情報に基づき識別された話者を示す情報を、前記話者識別サーバから受信する受信手段と、
前記話者を示す情報を通話相手の通話端末に送信する第２の送信手段と、
を有する通話端末。
３．１又は２に記載の通話端末において、
前記生体情報取得手段は、体の一部の画像、体で反射した反射音、指紋又は声を取得する手段を有する通話端末。
４．１又は２に記載の通話端末において、
前記生体情報取得手段は、体の一部の画像、体で反射した反射音、指紋及び声の中の少なくとも２つを取得する手段を有し、
周囲環境に基づき、前記話者識別サーバに送信する前記生体情報の種類を選択する選択手段を有し、
前記第１の送信手段は、前記選択手段により選択された種類の前記生体情報、又は、前記選択手段により選択された種類の前記生体情報から抽出された特徴量である前記話者識別情報を前記話者識別サーバに送信する通話端末。
５．４に記載の通話端末において、
前記選択手段は、周囲が所定レベルより暗い場合、体で反射した反射音又は声を選択する通話端末。
６．４又は５に記載の通話端末において、
前記選択手段は、周囲の騒音レベルが閾値以上である場合、体の一部の画像または指紋を選択する通話端末。
７．１から６のいずれかに記載の通話端末において、
前記生体情報取得手段は、発呼操作後かつ通話開始前に、前記生体情報を取得し、
前記第１の送信手段は、発呼操作後かつ通話開始前に、前記話者識別情報を前記話者識別サーバに送信する通話端末。
８．１から７のいずれかに記載の通話端末において、
内線通話中に、前記生体情報取得手段は前記生体情報を繰り返し取得し、前記第１の送信手段は前記話者識別情報を前記話者識別サーバに送信し、
外線通話中に、前記生体情報取得手段は前記生体情報を取得せず、前記第１の送信手段は前記話者識別情報を前記話者識別サーバに送信しない通話端末。
９．１から８のいずれかに記載の通話端末において、
所定の機能を有する通話端末と通話中に、前記生体情報取得手段は前記生体情報を繰り返し取得し、前記第１の送信手段は前記話者識別情報を前記話者識別サーバに送信し、
所定の機能を有さない通話端末と通話中に、前記生体情報取得手段は前記生体情報を取得せず、前記第１の送信手段は前記話者識別情報を前記話者識別サーバに送信しない通話端末。
１０．１から９のいずれかに記載の通話端末において、
所定の電話番号の通話端末と通話中に、前記生体情報取得手段は前記生体情報を繰り返し取得し、前記第１の送信手段は前記話者識別情報を前記話者識別サーバに送信し、
所定の電話番号でない通話端末と通話中に、前記生体情報取得手段は前記生体情報を取得せず、前記第１の送信手段は前記話者識別情報を前記話者識別サーバに送信しない通話端末。
１１．通話中の通話端末から、通話中に繰り返し取得された生体情報、又は、前記生体情報から抽出された特徴量である話者識別情報を受信する受信手段と、
前記話者識別情報と、予め登録されている参照情報とに基づき、話者を識別する話者識別手段と、
識別した話者を示す情報を、前記話者識別情報の送信元の通話端末、又は、前記話者識別情報の送信元の通話端末と通話中の通話端末に送信する送信手段と、
を有する話者識別サーバ。
１２．１１に記載の話者識別サーバにおいて、
前記話者識別手段による識別結果を登録する通話履歴登録手段を有する話者識別サーバ。
１３．１２に記載の話者識別サーバにおいて、
前記通話履歴登録手段は、通話中の２つの通話端末各々に対応して識別された話者を、互いに対応付けて登録する話者識別サーバ。
１４．１２又は１３に記載の話者識別サーバにおいて、
前記通話履歴登録手段は、ある通話端末に対応して識別された話者を、その通話端末に対応付けて登録する話者識別サーバ。
１５．１２から１４のいずれかに記載の話者識別サーバにおいて、
前記通話履歴登録手段は、ある通話端末に対応して識別された複数の話者を、互いに対応付けて登録する話者識別サーバ。
１６．１５に記載の話者識別サーバにおいて、
前記通話履歴登録手段は、１つの通話中に、ある通話端末に対応して識別された複数の話者を、互いに対応付けて登録する話者識別サーバ。
１７．１１から１６のいずれかに記載の話者識別サーバにおいて、
前記受信手段は、発呼操作後かつ通話開始前に前記通話端末から前記話者識別情報を受信し、
前記送信手段は、前記識別した話者を示す情報を、前記話者識別情報の送信元の通話端末、又は、前記話者識別情報の送信元の通話端末の発呼先の通話端末に送信する話者識別サーバ。
１８．通話端末と話者識別サーバとを有し、
前記通話端末は、
他の通話端末と通話する通話手段と、
通話中、生体情報を繰り返し取得する生体情報取得手段と、
通話中、前記生体情報、又は、前記生体情報から抽出された特徴量である話者識別情報を前記話者識別サーバに繰り返し送信する第１の送信手段と、
を有し、
前記話者識別サーバは、
通話中の前記通話端末から、前記話者識別情報を受信する受信手段と、
前記話者識別情報と、予め登録されている参照情報とに基づき、話者を識別する話者識別手段と、
識別した話者を示す情報を、前記話者識別情報の送信元の通話端末、又は、前記話者識別情報の送信元の通話端末と通話中の通話端末に送信する送信手段と、
を有する通話システム。
１９．コンピュータが、
他の通話端末と通話する通話工程と、
通話中、生体情報を繰り返し取得する生体情報取得工程と、
通話中、前記生体情報、又は、前記生体情報から抽出された特徴量である話者識別情報を、前記話者識別情報に基づき話者を識別する話者識別サーバに繰り返し送信する第１の送信工程と、
を実行する通話端末の処理方法。
２０．コンピュータを、
他の通話端末と通話する通話手段、
通話中、生体情報を繰り返し取得する生体情報取得手段、
通話中、前記生体情報、又は、前記生体情報から抽出された特徴量である話者識別情報を、前記話者識別情報に基づき話者を識別する話者識別サーバに繰り返し送信する第１の送信手段、
として機能させるプログラム。
２１．コンピュータが、
通話中の通話端末から、通話中に繰り返し取得された生体情報、又は、前記生体情報から抽出された特徴量である話者識別情報を受信する受信工程と、
前記話者識別情報と、予め登録されている参照情報とに基づき、話者を識別する話者識別工程と、
識別した話者を示す情報を、前記話者識別情報の送信元の通話端末、又は、前記話者識別情報の送信元の通話端末と通話中の通話端末に送信する送信工程と、
を実行する話者識別サーバの処理方法。
２２．コンピュータを、
通話中の通話端末から、通話中に繰り返し取得された生体情報、又は、前記生体情報から抽出された特徴量である話者識別情報を受信する受信手段、
前記話者識別情報と、予め登録されている参照情報とに基づき、話者を識別する話者識別手段、
識別した話者を示す情報を、前記話者識別情報の送信元の通話端末、又は、前記話者識別情報の送信元の通話端末と通話中の通話端末に送信する送信手段、
として機能させるプログラム。 Hereinafter, examples of reference embodiments are appended.
1. A call terminal comprising:
a call means for making a call with another call terminal;
a biometric information acquisition means for repeatedly acquiring biometric information during a call; and
a first transmission means for repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information.
2. The call terminal according to 1, further comprising:
a reception means for receiving, from the speaker identification server, information indicating the speaker identified based on the speaker identification information; and
a second transmission means for transmitting the information indicating the speaker to the call terminal of the other party to the call.
3. The call terminal according to 1 or 2, wherein
the biometric information acquisition means has a means for acquiring an image of a part of a body, a reflected sound reflected by the body, a fingerprint, or a voice.
4. The call terminal according to 1 or 2, wherein
the biometric information acquisition means has a means for acquiring at least two of an image of a part of a body, a reflected sound reflected by the body, a fingerprint, and a voice,
the call terminal has a selection means for selecting, based on the surrounding environment, the type of the biometric information to be transmitted to the speaker identification server, and
the first transmission means transmits, to the speaker identification server, the biometric information of the type selected by the selection means, or the speaker identification information that is a feature amount extracted from the biometric information of the type selected by the selection means.
5. The call terminal according to 4, wherein
the selection means selects the reflected sound reflected by the body or the voice when the surroundings are darker than a predetermined level.
6. The call terminal according to 4 or 5, wherein
the selection means selects the image of a part of the body or the fingerprint when the ambient noise level is equal to or higher than a threshold.
7. The call terminal according to any one of 1 to 6, wherein
the biometric information acquisition means acquires the biometric information after a call origination operation and before the start of the call, and
the first transmission means transmits the speaker identification information to the speaker identification server after the call origination operation and before the start of the call.
8. The call terminal according to any one of 1 to 7, wherein
during an extension call, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server, and
during an outside line call, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server.
9. The call terminal according to any one of 1 to 8, wherein
during a call with a call terminal having a predetermined function, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server, and
during a call with a call terminal not having the predetermined function, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server.
10. The call terminal according to any one of 1 to 9, wherein
during a call with a call terminal having a predetermined telephone number, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server, and
during a call with a call terminal whose telephone number is not the predetermined telephone number, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server.
11. A speaker identification server comprising:
a reception means for receiving, from a call terminal in a call, biometric information repeatedly acquired during the call, or speaker identification information that is a feature amount extracted from the biometric information;
a speaker identification means for identifying a speaker based on the speaker identification information and reference information registered in advance; and
a transmission means for transmitting information indicating the identified speaker to the call terminal that is the source of the speaker identification information, or to a call terminal in a call with that source call terminal.
12. The speaker identification server according to 11, further comprising
a call history registration means for registering the identification result of the speaker identification means.
13. The speaker identification server according to 12, wherein
the call history registration means registers the speakers identified for each of two call terminals in a call in association with each other.
14. The speaker identification server according to 12 or 13, wherein
the call history registration means registers a speaker identified for a certain call terminal in association with that call terminal.
15. The speaker identification server according to any one of 12 to 14, wherein
the call history registration means registers a plurality of speakers identified for a certain call terminal in association with each other.
16. The speaker identification server according to 15, wherein
the call history registration means registers a plurality of speakers identified for a certain call terminal during one call in association with each other.
17. The speaker identification server according to any one of 11 to 16, wherein
the reception means receives the speaker identification information from the call terminal after a call origination operation and before the start of the call, and
the transmission means transmits the information indicating the identified speaker to the call terminal that is the source of the speaker identification information, or to the call terminal that is the call destination of that source call terminal.
18. A call system comprising a call terminal and a speaker identification server, wherein
the call terminal has:
a call means for making a call with another call terminal;
a biometric information acquisition means for repeatedly acquiring biometric information during a call; and
a first transmission means for repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to the speaker identification server, and
the speaker identification server has:
a reception means for receiving the speaker identification information from the call terminal in a call;
a speaker identification means for identifying a speaker based on the speaker identification information and reference information registered in advance; and
a transmission means for transmitting information indicating the identified speaker to the call terminal that is the source of the speaker identification information, or to a call terminal in a call with that source call terminal.
19. A processing method of a call terminal, in which a computer executes:
a call step of making a call with another call terminal;
a biometric information acquisition step of repeatedly acquiring biometric information during a call; and
a first transmission step of repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information.
20. A program causing a computer to function as:
a call means for making a call with another call terminal;
a biometric information acquisition means for repeatedly acquiring biometric information during a call; and
a first transmission means for repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information.
21. A processing method of a speaker identification server, in which a computer executes:
a reception step of receiving, from a call terminal in a call, biometric information repeatedly acquired during the call, or speaker identification information that is a feature amount extracted from the biometric information;
a speaker identification step of identifying a speaker based on the speaker identification information and reference information registered in advance; and
a transmission step of transmitting information indicating the identified speaker to the call terminal that is the source of the speaker identification information, or to a call terminal in a call with that source call terminal.
22. A program causing a computer to function as:
a reception means for receiving, from a call terminal in a call, biometric information repeatedly acquired during the call, or speaker identification information that is a feature amount extracted from the biometric information;
a speaker identification means for identifying a speaker based on the speaker identification information and reference information registered in advance; and
a transmission means for transmitting information indicating the identified speaker to the call terminal that is the source of the speaker identification information, or to a call terminal in a call with that source call terminal.
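
The selection of a biometric type based on the surrounding environment (reference examples 4 to 6) might be sketched as follows. This is only an illustrative Python sketch: the function name `select_biometric_types`, the illuminance and noise units, and both threshold values are assumptions, and combining the dark and noisy conditions by intersection is one possible reading rather than something the text specifies.

```python
from enum import Enum


class BiometricType(Enum):
    BODY_IMAGE = "image of a part of the body"
    REFLECTED_SOUND = "reflected sound reflected by the body"
    FINGERPRINT = "fingerprint"
    VOICE = "voice"


# Types selected when the surroundings are dark (reference example 5).
USABLE_IN_DARK = {BiometricType.REFLECTED_SOUND, BiometricType.VOICE}
# Types selected when the ambient noise is high (reference example 6).
USABLE_IN_NOISE = {BiometricType.BODY_IMAGE, BiometricType.FINGERPRINT}


def select_biometric_types(available, illuminance_lux, noise_db,
                           dark_below_lux=10.0, noisy_at_or_above_db=70.0):
    """Return the biometric types that may be sent to the server.

    The thresholds are hypothetical placeholders for the "predetermined
    level" and the "threshold" named in the reference examples.
    """
    candidates = set(available)
    if illuminance_lux < dark_below_lux:
        # Darker than the predetermined level: keep reflected sound or voice.
        candidates &= USABLE_IN_DARK
    if noise_db >= noisy_at_or_above_db:
        # Noise at or above the threshold: keep body image or fingerprint.
        candidates &= USABLE_IN_NOISE
    return candidates
```

When the surroundings are both dark and noisy, this sketch returns an empty set and leaves the fallback to the caller, since the reference examples do not state what should happen in that case.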

１Ａプロセッサ
２Ａメモリ
３Ａ入出力Ｉ／Ｆ
４Ａ周辺回路
５Ａバス
１０通話端末
１１通話部
１２生体情報取得部
１３第１の送信部
１４受信部
１５第２の送信部
１６選択部
２０話者識別サーバ
２１受信部
２２話者識別部
２３送信部
２４通話履歴登録部
1A Processor
2A Memory
3A Input/output I/F
4A Peripheral circuit
5A Bus
10 Call terminal
11 Call unit
12 Biometric information acquisition unit
13 First transmission unit
14 Reception unit
15 Second transmission unit
16 Selection unit
20 Speaker identification server
21 Reception unit
22 Speaker identification unit
23 Transmission unit
24 Call history registration unit
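
As an illustration of how the call terminal 10 might restrict acquisition and transmission to certain calls (reference examples 8 and 10), the following Python sketch gates a repeated acquire-and-send loop on a policy. It is a sketch under stated assumptions, not the terminal's actual implementation: `CallContext`, `LineType`, the policy functions, the callback parameters, and the one-second interval are all hypothetical names and values.

```python
import time
from dataclasses import dataclass
from enum import Enum


class LineType(Enum):
    EXTENSION = "extension"
    OUTSIDE = "outside"


@dataclass(frozen=True)
class CallContext:
    line_type: LineType
    peer_number: str  # telephone number of the other party's call terminal


def allow_extension_only(call):
    """Policy of reference example 8: identify only during extension calls."""
    return call.line_type is LineType.EXTENSION


def allow_predetermined_numbers(numbers):
    """Policy of reference example 10: identify only for predetermined numbers."""
    allowed = frozenset(numbers)
    return lambda call: call.peer_number in allowed


def run_identification(call, policy, acquire, send, call_active, interval_s=1.0):
    """Repeatedly acquire and send biometric information while the call lasts.

    If the policy rejects the call, nothing is acquired and nothing is sent.
    Returns the number of transmissions made.
    """
    if not policy(call):
        return 0
    sent = 0
    while call_active():
        send(acquire())
        sent += 1
        time.sleep(interval_s)  # repetition interval is an assumption
    return sent
```

Checking the policy before the loop means that, for an outside line call or a non-predetermined number, the acquisition callback is never invoked, which matches the "does not acquire" wording rather than acquiring and discarding.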

Claims

A call terminal comprising:
a call means for making a call with another call terminal;
a biometric information acquisition means for repeatedly acquiring biometric information during a call; and
a first transmission means for repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information,
wherein, during an extension call, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server, and
during an outside line call, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server.

A call terminal comprising:
a call means for making a call with another call terminal;
a biometric information acquisition means for repeatedly acquiring biometric information during a call; and
a first transmission means for repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information,
wherein, during a call with a call terminal having a predetermined telephone number, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server, and
during a call with a call terminal whose telephone number is not the predetermined telephone number, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server.

The call terminal according to claim 1 or 2, further comprising:
a reception means for receiving, from the speaker identification server, information indicating the speaker identified based on the speaker identification information; and
a second transmission means for transmitting the information indicating the speaker to the call terminal of the other party to the call.

The call terminal according to any one of claims 1 to 3, wherein
the biometric information acquisition means has a means for acquiring an image of a part of a body, a reflected sound reflected by the body, a fingerprint, or a voice.

The call terminal according to any one of claims 1 to 4, wherein
the biometric information acquisition means has a means for acquiring at least two of an image of a part of a body, a reflected sound reflected by the body, a fingerprint, and a voice,
the call terminal has a selection means for selecting, based on the surrounding environment, the type of the biometric information to be transmitted to the speaker identification server, and
the first transmission means transmits, to the speaker identification server, the biometric information of the type selected by the selection means, or the speaker identification information that is a feature amount extracted from the biometric information of the type selected by the selection means.

The call terminal according to claim 5, wherein
the selection means selects the reflected sound reflected by the body or the voice when the surroundings are darker than a predetermined level.

The call terminal according to claim 5 or 6, wherein
the selection means selects the image of a part of the body or the fingerprint when the ambient noise level is equal to or higher than a threshold.

The call terminal according to any one of claims 1 to 7, wherein
the biometric information acquisition means acquires the biometric information after a call origination operation and before the start of the call, and
the first transmission means transmits the speaker identification information to the speaker identification server after the call origination operation and before the start of the call.

The call terminal according to any one of claims 1 to 8, wherein
during a call with a call terminal having a predetermined function, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server, and
during a call with a call terminal not having the predetermined function, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server.

A speaker identification server comprising:
a reception means for receiving, from a call terminal in a call, biometric information repeatedly acquired during the call, or speaker identification information that is a feature amount extracted from the biometric information;
a speaker identification means for identifying a speaker based on the speaker identification information received by the reception means and reference information registered in advance; and
a transmission means for transmitting information indicating the speaker identified by the speaker identification means to the call terminal that is the source of the speaker identification information, or to a call terminal in a call with that source call terminal,
wherein the reception means receives the biometric information and the speaker identification information from the call terminal during an extension call, and does not receive the biometric information and the speaker identification information from the call terminal during an outside line call.

A speaker identification server comprising:
a reception means for receiving, from a call terminal in a call, biometric information repeatedly acquired during the call, or speaker identification information that is a feature amount extracted from the biometric information;
a speaker identification means for identifying a speaker based on the speaker identification information received by the reception means and reference information registered in advance; and
a transmission means for transmitting information indicating the speaker identified by the speaker identification means to the call terminal that is the source of the speaker identification information, or to a call terminal in a call with that source call terminal,
wherein the reception means receives the biometric information and the speaker identification information from the call terminal while it is in a call with a call terminal having a predetermined telephone number, and does not receive the biometric information and the speaker identification information from the call terminal while it is in a call with a call terminal whose telephone number is not the predetermined telephone number.

The speaker identification server according to claim 10 or 11, further comprising
a call history registration means for registering the identification result of the speaker identification means.

The speaker identification server according to claim 12, wherein
the call history registration means registers the speakers identified for each of two call terminals in a call in association with each other.

The speaker identification server according to claim 12 or 13, wherein
the call history registration means registers a speaker identified for a certain call terminal in association with that call terminal.

The speaker identification server according to any one of claims 12 to 14, wherein
the call history registration means registers a plurality of speakers identified for a certain call terminal in association with each other.

The speaker identification server according to claim 15, wherein
the call history registration means registers a plurality of speakers identified for a certain call terminal during one call in association with each other.

The speaker identification server according to any one of claims 11 to 16, wherein
the reception means receives the speaker identification information from the call terminal after a call origination operation and before the start of the call, and
the transmission means transmits the information indicating the identified speaker to the call terminal that is the source of the speaker identification information, or to the call terminal that is the call destination of that source call terminal.

A call system comprising a call terminal and a speaker identification server, wherein
the call terminal has:
a call means for making a call with another call terminal;
a biometric information acquisition means for repeatedly acquiring biometric information during a call; and
a first transmission means for repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to the speaker identification server,
during an extension call, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server,
during an outside line call, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server, and
the speaker identification server has:
a reception means for receiving the speaker identification information from the call terminal in a call;
a speaker identification means for identifying a speaker based on the speaker identification information and reference information registered in advance; and
a transmission means for transmitting information indicating the identified speaker to the call terminal that is the source of the speaker identification information, or to a call terminal in a call with that source call terminal.

A call system comprising a call terminal and a speaker identification server, wherein
the call terminal has:
a call means for making a call with another call terminal;
a biometric information acquisition means for repeatedly acquiring biometric information during a call; and
a first transmission means for repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to the speaker identification server,
during a call with a call terminal having a predetermined telephone number, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server,
during a call with a call terminal whose telephone number is not the predetermined telephone number, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server, and
the speaker identification server has:
a reception means for receiving the speaker identification information from the call terminal in a call;
a speaker identification means for identifying a speaker based on the speaker identification information and reference information registered in advance; and
a transmission means for transmitting information indicating the identified speaker to the call terminal that is the source of the speaker identification information, or to a call terminal in a call with that source call terminal.

A processing method of a call terminal, in which a computer provided in the call terminal executes:
a call step of making a call with another call terminal;
a biometric information acquisition step of repeatedly acquiring biometric information during a call; and
a first transmission step of repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information,
wherein the computer
executes the biometric information acquisition step and the first transmission step when the call terminal is in an extension call in the call step, and
does not execute the biometric information acquisition step and the first transmission step when the call terminal is in an outside line call in the call step.

A processing method of a call terminal, in which a computer provided in the call terminal executes:
a call step of making a call with another call terminal;
a biometric information acquisition step of repeatedly acquiring biometric information during a call; and
a first transmission step of repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information,
wherein the computer
executes the biometric information acquisition step and the first transmission step when the call terminal is in a call with a call terminal having a predetermined telephone number in the call step, and
does not execute the biometric information acquisition step and the first transmission step when the call terminal is in a call with a call terminal whose telephone number is not the predetermined telephone number in the call step.

A program causing a computer provided in a call terminal to function as:
a call means for making a call with another call terminal;
a biometric information acquisition means for repeatedly acquiring biometric information during a call; and
a first transmission means for repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information,
wherein, while the call terminal is in an extension call, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server, and
while the call terminal is in an outside line call, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server.

A program causing a computer provided in a call terminal to function as:
a call means for making a call with another call terminal;
a biometric information acquisition means for repeatedly acquiring biometric information during a call; and
a first transmission means for repeatedly transmitting, during a call, the biometric information, or speaker identification information that is a feature amount extracted from the biometric information, to a speaker identification server that identifies a speaker based on the speaker identification information,
wherein, while the call terminal is in a call with a call terminal having a predetermined telephone number, the biometric information acquisition means repeatedly acquires the biometric information and the first transmission means transmits the speaker identification information to the speaker identification server, and
while the call terminal is in a call with a call terminal whose telephone number is not the predetermined telephone number, the biometric information acquisition means does not acquire the biometric information and the first transmission means does not transmit the speaker identification information to the speaker identification server.

A processing method of a speaker identification server, in which a computer provided in the speaker identification server used with a call terminal,
when the call terminal is in an extension call, executes:
a reception step of receiving, from the call terminal, biometric information repeatedly acquired during the call, or speaker identification information that is a feature amount extracted from the biometric information;
a speaker identification step of identifying a speaker based on the speaker identification information received in the reception step and reference information registered in advance; and
a transmission step of transmitting information indicating the speaker identified in the speaker identification step to the call terminal that is the source of the speaker identification information, or to a call terminal in a call with that source call terminal,
and,
when the call terminal is in an outside line call,
does not execute the reception step, the speaker identification step, and the transmission step.

A processing method of a speaker identification server, in which a computer of the speaker identification server used with a call terminal,
when the call terminal is in a call with a call terminal having a predetermined telephone number, executes:
a receiving step of receiving, from the call terminal, biometric information repeatedly acquired during the call or speaker identification information that is a feature amount extracted from the biometric information;
a speaker identification step of identifying a speaker based on the speaker identification information received in the receiving step and reference information registered in advance; and
a transmission step of transmitting information indicating the speaker identified in the speaker identification step to the call terminal that sent the speaker identification information, or to the call terminal in a call with the call terminal that sent the speaker identification information,
and,
when the call terminal is in a call with a call terminal that does not have a predetermined telephone number,
does not execute the receiving step, the speaker identification step, or the transmission step.
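
The telephone-number condition in this claim can be sketched in Python as below, under stated assumptions. The function `handle_call_data`, its parameters, and the `to_peer` flag are hypothetical names introduced for illustration; how the predetermined telephone numbers are stored is not specified by the claim, and a plain set is assumed here.

```python
from typing import Callable, Optional, Set, Tuple


def handle_call_data(
    sender: str,
    peer: str,
    peer_number: str,
    features: Tuple[float, ...],
    predetermined_numbers: Set[str],
    identify: Callable[[Tuple[float, ...]], str],
    to_peer: bool = True,
) -> Optional[Tuple[str, str]]:
    """Return (destination terminal, identified speaker), or None when skipped.

    `identify` stands in for the speaker identification step and is only
    called when the other party has a predetermined telephone number.
    """
    if peer_number not in predetermined_numbers:
        return None  # receiving, identification, and transmission are all skipped
    speaker = identify(features)
    # The claim allows either destination: the sender or the other party.
    destination = peer if to_peer else sender
    return destination, speaker
```

Passing the identification routine as a callable makes it easy to confirm that no identification work is done for calls with non-predetermined numbers.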

A program causing a computer of a speaker identification server used with a call terminal to function as:
receiving means for receiving, from the call terminal during a call, biometric information repeatedly acquired during the call or speaker identification information that is a feature amount extracted from the biometric information;
speaker identification means for identifying a speaker based on the speaker identification information received by the receiving means and reference information registered in advance; and
transmission means for transmitting information indicating the speaker identified by the speaker identification means to the call terminal that sent the speaker identification information, or to the call terminal in a call with the call terminal that sent the speaker identification information,
wherein
the receiving means receives the biometric information and the speaker identification information from the call terminal during an extension call, and does not receive the biometric information and the speaker identification information from the call terminal during an outside line call.

A program causing a computer of a speaker identification server used with a call terminal to function as:
receiving means for receiving, from the call terminal during a call, biometric information repeatedly acquired during the call or speaker identification information that is a feature amount extracted from the biometric information;
speaker identification means for identifying a speaker based on the speaker identification information received by the receiving means and reference information registered in advance; and
transmission means for transmitting information indicating the speaker identified by the speaker identification means to the call terminal that sent the speaker identification information, or to the call terminal in a call with the call terminal that sent the speaker identification information,
wherein
the receiving means receives the biometric information and the speaker identification information from the call terminal during a call with a call terminal having a predetermined telephone number, and does not receive the biometric information and the speaker identification information from the call terminal during a call with a call terminal that does not have a predetermined telephone number.