JP7500057B2 - Communication management device and method

Info

Publication number: JP7500057B2
Application number: JP2020112961A
Authority: JP
Inventors: 篤掛村; 涼太吉澤; 祐太郎佐伯
Original assignee: ボイット株式会社
Priority date: 2020-01-27
Filing date: 2020-06-30
Publication date: 2024-06-17
Anticipated expiration: 2040-06-30
Also published as: JP2021117965A

Description

An embodiment of the present invention relates to technology for supporting communication (shared awareness, mutual understanding, and the like) using voice and text.

One example of voice communication is the transceiver. A transceiver is a radio device that both transmits and receives radio waves, allowing one user to talk with multiple users (one-way or two-way information transmission). Transceivers are used at construction sites, event venues, and facilities such as hotels and inns. Taxi radio is another example of transceiver use.

Japanese Unexamined Patent Application Publication No. 2013-187599 (JP 2013-187599 A)

An object is to provide a communication system that forms a communication group including an agent that reports a state or a change in situation, and that supports the transmission of information among multiple users.

In the communication system of the embodiment, a user's spoken voice is broadcast to the mobile communication terminals of other users through the mobile communication terminals carried by the respective users. The communication system includes a communication management device to which each mobile communication terminal connects by wireless communication, and an agent device that receives detection information output from a state detection device for a monitored object and connects to the communication management device. The communication management device includes a communication control unit having a first control unit and a second control unit. The first control unit broadcasts spoken voice data received from a mobile communication terminal to each of the other mobile communication terminals. The second control unit accumulates, in chronological order as a communication history between users, the speech recognition results obtained by performing speech recognition on the received spoken voice data, and controls text distribution so that the communication history is displayed in synchronization on each mobile communication terminal. The agent device includes an utterance text transmission unit that generates agent utterance text based on the detection information and transmits it to the communication management device. The communication control unit broadcasts synthesized voice data of the agent utterance text, generated by speech synthesis processing, to each of the mobile communication terminals, accumulates the received agent utterance text in chronological order as part of the communication history between users, and controls text distribution to each mobile communication terminal.

FIG. 1 is a network configuration diagram of a communication system according to a first embodiment.
FIG. 2 is a block diagram showing the configurations of a communication management device, an agent device, and a user terminal according to the first embodiment.
FIG. 3 is a diagram showing an example of user information and group information according to the first embodiment.
FIG. 4 is an example of a screen displayed on a user terminal according to the first embodiment.
FIG. 5 is a diagram showing an example of setting management information according to the first embodiment.
FIG. 6 is a diagram showing a processing flow of the communication system according to the first embodiment.
FIG. 7 is a diagram showing a processing flow based on a first example of the communication system according to the first embodiment.
FIG. 8 is a network configuration diagram of a communication system according to a second embodiment.
FIG. 9 is a block diagram showing the configurations of a communication management device, an agent device, and a user terminal according to the second embodiment.
FIG. 10 is a diagram showing a processing flow based on a second example of the communication system according to the second embodiment.
FIG. 11 is an example of a screen displayed on a user terminal according to the second embodiment.
FIG. 12 is a diagram for explaining an example of individual call mode interrupt processing during a group call mode according to a third embodiment.
FIG. 13 is a block diagram showing the configurations of a communication management device, an agent device, and a user terminal according to the third embodiment.
FIG. 14 is a diagram showing an example of specific notification setting information according to the third embodiment.
FIG. 15 is a diagram showing a processing flow based on a third example of the communication system according to the third embodiment.

(First Embodiment)
FIGS. 1 to 7 are diagrams for explaining the first embodiment. FIG. 1 is a network configuration diagram of the communication system according to this embodiment. The communication system provides an information transmission support function using voice and text, centered on a communication management device (hereinafter referred to as the management device) 100. The following describes an aspect in which the communication system is applied to facility management as an example.

The management device 100 is connected by wireless communication to user terminals (mobile communication terminals) 500 carried by the respective users, and broadcasts a user's spoken voice to the other user terminals 500.

The user terminal 500 is a portable terminal (mobile terminal) such as a multifunction mobile phone (for example, a smartphone), a PDA (Personal Digital Assistant), or a tablet terminal. The user terminal 500 has a communication function, a computing function, and an input function, and connects wirelessly to the management device 100 through an IP (Internet Protocol) network or a mobile communication network to perform data communication.

The range over which a user's spoken voice is broadcast to the other user terminals 500 (or the range over which the communication history described later is displayed in synchronization) is set as a communication group, and the user terminal 500 of each target user (on-site user) is registered in it. As shown in FIG. 1, in this embodiment an agent device 300 is registered as a member (agent) of the communication group in which the multiple users are registered. The agent device 300 receives detection information output from a state detection device (sensor device 1) for a monitored object in facility management, and connects to the management device 100 by wireless or wired communication.

When the monitored object is a hot spring, the state of the hot spring is, for example, its temperature. In this case, the state detection device is a measuring instrument such as the temperature sensor 1. The temperature sensor 1 outputs the detected temperature to the agent device 300 as detection information. On receiving the detected temperature, the agent device 300 generates agent utterance text based on it and transmits the text to the management device 100. In other words, the agent device 300 is a device that speaks on the basis of detection information as a member of the communication group, in the same way as a user carrying a user terminal 500, and is positioned as a proxy speaker that speaks on behalf of the state detection device.

The agent device 300 is a desktop computer, tablet computer, or laptop computer. The agent device 300 has a computing function (CPU, etc.) and a data communication function for wireless or wired communication through an IP network or a mobile communication network. The agent device 300 can also be configured with a display device (or a touch panel display device) and character input means. Alternatively, the agent device 300 may be a dedicated device equipped with the functions of this embodiment.

The communication system of this embodiment supports the transmission of information for shared awareness and mutual understanding, on the premise that the users can converse hands-free. It also forms a communication group that includes an agent reporting the state or changes in the situation of a monitored object in facility management, and uses the agent's utterance function to make the acquisition and reporting of that information, which until now was done manually, more efficient.

In particular, equipment management in facilities relies on human labor, and the work of operating and controlling equipment is always present. Such operation and control requires continuously checking the state and condition of the equipment, so a user must visit the equipment to check its condition or visit the place where the state detection device is installed to check the detection information. This has required a great deal of labor. In recent years, linking sensor devices with the operation and control of equipment through the IoT (Internet of Things) has attracted attention, but because of cost and other issues, the work in practice still relies on human labor as described above.

In this embodiment, for equipment that is operated and controlled manually, a mechanism is introduced in which a sensor device or the like that outputs detection information for grasping the state and condition of the equipment speaks, based on that detection information, as a member of the user communication group, thereby reducing the users' labor. In addition, a simple, low-cost system configuration is realized in which an existing state detection device such as a sensor device can easily join the user communication group simply by installing, at the equipment management site, an agent device 300 that receives its detection information.

FIG. 2 is a block diagram showing the configurations of the management device 100, the agent device 300, and the user terminal 500.

The management device 100 includes a control device 110, a storage device 120, and a communication device 130. The communication device 130 manages communication connections and controls data communication with each of the user terminals 500, and performs broadcast communication control that sends the same spoken voice and utterance text to all of the user terminals 500 simultaneously.

The control device 110 includes a user management unit 111, a communication control unit 112, a voice recognition unit 113, and a voice synthesis unit 114. The storage device 120 holds user information 121, group information 122, communication history (communication log) information 123, a voice recognition dictionary 124, and a voice synthesis dictionary 125.

The agent device 300 is connected, wirelessly or by wire, to a state detection device (sensor device 1) installed in the equipment to be managed, and includes a sensor information acquisition unit 320 that receives, via a communication unit 310, the detection information output from the state detection device. The agent device 300 also includes a control unit (determination unit) 330, an utterance text transmission unit 340, a setting management unit 350, and a storage unit 360.

The user terminal 500 includes a communication/call unit 510, a communication App control unit 520, a microphone 530, a speaker 540, a display input unit 550 such as a touch panel, and a storage unit 560. In practice, the speaker 540 takes the form of earphones or headphones (wired or wireless).

FIG. 3 shows examples of various kinds of information. The user information 121 is registration information for the users of this communication system. The user management unit 111 provides control so that a user ID, user name, attribute, and group can be set through a predetermined management screen. The agent device 300 is also registered as a user. The group information 122 is group identification information that delimits the communication groups. Transmission, reception, and broadcast distribution of information are controlled per communication group ID so that information is not mixed between different communication groups. In the user information 121, a communication group registered in the group information 122 can be linked to each user.
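The relationship between the user information 121 and the group information 122 can be sketched as a small registry. This is only an illustration: the class names, field names, and attribute values (`"staff"`, `"agent"`) are assumptions and do not come from FIG. 3.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """One row of user information; an agent device is registered the same way."""
    user_id: str
    name: str
    attribute: str                      # hypothetical values: "staff", "agent"
    group_ids: set = field(default_factory=set)


class UserRegistry:
    """Hypothetical stand-in for user information 121 and group information 122."""

    def __init__(self):
        self.groups = {}    # group_id -> group name
        self.members = {}   # user_id -> Member

    def add_group(self, group_id, name):
        self.groups[group_id] = name

    def register(self, member):
        # A member may only be linked to groups that are already registered.
        unknown = member.group_ids - set(self.groups)
        if unknown:
            raise ValueError(f"unknown group(s): {sorted(unknown)}")
        self.members[member.user_id] = member

    def members_of(self, group_id):
        """Return the IDs whose traffic must stay inside this group."""
        return sorted(m.user_id for m in self.members.values()
                      if group_id in m.group_ids)


registry = UserRegistry()
registry.add_group("G1", "Hot spring maintenance")
registry.add_group("G2", "Housekeeping")
registry.register(Member("u-a", "User A", "staff", {"G1"}))
registry.register(Member("u-b", "User B", "staff", {"G1", "G2"}))
registry.register(Member("agent-1", "Temperature agent", "agent", {"G1"}))
```

Looking up `registry.members_of("G1")` yields the delivery targets for that group only, which is one way to keep information from mixing between groups.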

The user management unit 111 of this embodiment provides a function for setting a communication group in which multiple users are registered and which is the target of the first control (broadcast distribution of spoken voice data) and the second control (broadcast distribution of the agent utterance text and/or the text of users' speech recognition results) described later. It also provides a function for registering the agent device 300 in the communication group.

Regarding grouping, a facility can be divided into multiple departments and managed accordingly, depending on the facility where the communication system of this embodiment is introduced. Taking an accommodation facility as an example, bellpersons (luggage carrying), concierges, and housekeeping (cleaning) can each be set as a different group, creating a communication environment in which guest room management is subdivided by group. From another viewpoint, there may be cases where communication is unnecessary because of roles. For example, food servers and bellpersons (luggage carrying) do not need to communicate directly, so they can be placed in separate groups. There may also be cases where communication is unnecessary for geographical reasons. For example, if branch A and branch B are geographically distant and do not need to communicate frequently, they can be placed in separate groups.

Accordingly, various communication groups can be set up side by side: groups in which an agent device 300 is registered, groups in which no agent device 300 is registered, and groups in which multiple agent devices 300 are registered. If a facility contains multiple pieces of equipment to be managed, an agent device 300 can be installed separately for each piece of equipment. If multiple state detection devices are installed on the same piece of equipment, an agent device 300 can be installed for each state detection device and all of them registered in one communication group.

The communication control unit 112 of the management device 100 functions as both the first control unit and the second control unit. The first control unit performs broadcast distribution control that sends spoken voice data received from a user terminal 500 to each of the other user terminals 500. The second control unit accumulates, in chronological order as the communication history 123 between users, the speech recognition results obtained by performing speech recognition on the received spoken voice data, and controls text distribution so that the communication history 123 is displayed in synchronization on each user terminal 500.

The function of the first control unit is the broadcast distribution of spoken voice data. Spoken voice data includes voice data generated artificially by speech synthesis from text (for example, agent utterance text) and voice data uttered by users. The voice synthesis unit 114 uses the voice synthesis dictionary 125 to synthesize voice data corresponding to the characters of the agent utterance text and generates synthesized voice data. The voice material used to compose the synthesized voice data is arbitrary.

The function of the second control unit is the broadcast distribution of text, namely the agent utterance text and the text of users' speech recognition results. In this embodiment, all voice input at a user terminal 500 and all voice played back at a user terminal 500 is converted to text, accumulated in chronological order in the communication history 123, and controlled so that it is displayed in synchronization on each user terminal 500. The voice recognition unit 113 performs speech recognition using the voice recognition dictionary 124 and outputs text data as the speech recognition result. Known techniques can be applied to the speech recognition processing.

The agent device 300 includes an utterance text transmission unit 340 that generates agent utterance text based on the detection information output from the state detection device and transmits it to the management device 100. As the first control function, the communication control unit 112 of the management device 100 performs speech synthesis on the agent utterance text received from the utterance text transmission unit 340 to generate synthesized voice data, and broadcasts it to each of the user terminals 500. As the second control function, it accumulates the agent utterance text received from the utterance text transmission unit 340 in chronological order as part of the communication history 123 between users, and controls text distribution to each user terminal 500.

The communication history information 123 is log information in which each user's utterances and the agent utterance text of the agent device 300 are accumulated as text, in chronological order, together with time information. The voice data corresponding to each text can be stored as an audio file in a predetermined storage area; for example, the storage location of the audio file is recorded in the communication history 123. The communication history information 123 is generated and accumulated separately for each communication group.
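A minimal sketch of such a log follows, assuming an in-memory store. The names `LogEntry` and `CommunicationHistory`, the field names, and the example file path are hypothetical; the source only states that text, time information, and the audio file location are recorded per group.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    timestamp: datetime
    speaker_id: str                     # a user ID or an agent device ID
    text: str
    audio_path: Optional[str] = None    # storage location of the audio file


class CommunicationHistory:
    """Keeps one chronologically ordered text log per communication group."""

    def __init__(self):
        self._logs = {}   # group_id -> list of LogEntry

    def append(self, group_id, entry):
        log = self._logs.setdefault(group_id, [])
        log.append(entry)
        # The sort is stable, so entries with equal timestamps keep arrival order.
        log.sort(key=lambda e: e.timestamp)

    def entries(self, group_id):
        return list(self._logs.get(group_id, []))


history = CommunicationHistory()
history.append("G1", LogEntry(datetime(2020, 6, 30, 9, 5), "agent-1",
                              "The temperature has fallen below 36 degrees.",
                              "audio/g1/0002.wav"))
history.append("G1", LogEntry(datetime(2020, 6, 30, 9, 0), "u-a",
                              "Starting the morning inspection."))
```

Entries are sorted on insertion so that a late-arriving entry still appears at its proper place in the timeline shown to every terminal.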

FIG. 4 shows an example of the communication history 123 displayed on each user terminal 500. Each user terminal 500 receives the communication history 123 from the management device 100 in real time or at a predetermined timing, so that the display is synchronized among the users and past communication logs can be reviewed in chronological order.

In the display field D, a voice mark M can be displayed for text that corresponds to synthesized voice data, and a microphone mark H can be displayed for the speaker's own utterance text.

As in the example of FIG. 4, on each user terminal 500 the utterances of the agent device 300 are displayed in the display field D in chronological order together with the user's own utterances and those of other users, and the communication history 123 accumulated in the management device 100 is shared as log information.

FIG. 5 shows an example of the setting management information used by the agent device 300. The conditions under which the agent device 300 speaks and the content of the utterance text are registered as setting management information. The control unit 330 functions as a determination unit that determines whether the detection information satisfies a judgment condition set in the setting management information.

In the example of FIG. 5, "setting 1" specifies the condition "temperature below 36 degrees" and the agent utterance text "The temperature has fallen below 36 degrees." "Setting 2" specifies the condition "temperature 42 degrees or higher" and the agent utterance text "The temperature has exceeded 42 degrees." The control unit 330 matches the detection information, which the sensor information acquisition unit 320 acquires at arbitrary time intervals, against the judgment conditions set in the setting management information and determines whether any condition is satisfied.

When the control unit 330 determines that a judgment condition is satisfied, the utterance text transmission unit 340 extracts the utterance text from the setting management information, generates agent utterance text data, and transmits it to the management device 100.
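The judgment against "setting 1" and "setting 2" can be sketched as follows. The thresholds and utterance texts follow the FIG. 5 example described above; the `Setting` class, the `judge` function, and the first-match policy are illustrative assumptions rather than the device's actual implementation.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Setting:
    name: str
    compare: Callable[[float, float], bool]   # how a reading is compared to the threshold
    threshold: float
    utterance: str


SETTINGS = [
    Setting("setting 1", operator.lt, 36.0,
            "The temperature has fallen below 36 degrees."),
    Setting("setting 2", operator.ge, 42.0,
            "The temperature has exceeded 42 degrees."),
]


def judge(temperature: float, settings=SETTINGS) -> Optional[str]:
    """Return the utterance text of the first satisfied condition, else None."""
    for setting in settings:
        if setting.compare(temperature, setting.threshold):
            return setting.utterance
    return None
```

With these settings a reading of exactly 36.0 produces no utterance, because setting 1 is a strict "below" condition, while a reading of exactly 42.0 triggers setting 2.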

The setting management information can be entered through a management information registration screen provided on the agent device 300. Alternatively, a setting management information file that records multiple pairs of differing judgment conditions and utterance texts can be created on another computer and stored in the agent device 300.
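The source does not specify the format of the setting management information file. As one possibility, the sketch below assumes a JSON list in which each item pairs a condition with an utterance text; the keys `op`, `threshold`, and `text` are hypothetical.

```python
import json
import operator

# Comparison operators accepted in the hypothetical file format.
OPERATORS = {"<": operator.lt, "<=": operator.le,
             ">": operator.gt, ">=": operator.ge}


def load_settings(json_text):
    """Parse condition/utterance pairs into (compare, threshold, text) tuples."""
    rules = []
    for item in json.loads(json_text):
        op = item["op"]
        if op not in OPERATORS:
            raise ValueError(f"unsupported operator: {op!r}")
        rules.append((OPERATORS[op], float(item["threshold"]), item["text"]))
    return rules


EXAMPLE_FILE = """
[
  {"op": "<",  "threshold": 36, "text": "The temperature has fallen below 36 degrees."},
  {"op": ">=", "threshold": 42, "text": "The temperature has exceeded 42 degrees."}
]
"""

rules = load_settings(EXAMPLE_FILE)
```

Rejecting unknown operators at load time means a mistyped file fails when it is stored on the agent device rather than silently never triggering.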

FIG. 6 shows the processing flow of the communication system of this embodiment.

Each user starts the communication App control unit 520 on the user terminal 500, and the communication App control unit 520 performs connection processing with the management device 100. The user then enters a user ID and password on a predetermined login screen to log in to the management device 100. Login authentication is performed by the user management unit 111. After login, each user terminal 500 performs information acquisition processing with the management device 100 at arbitrary timing or at predetermined time intervals.

When user A speaks, the communication App control unit 520 picks up the spoken voice and transmits the spoken voice data to the management device 100 (S501a). The voice recognition unit 113 of the management device 100 performs speech recognition on the received spoken voice data (S101) and outputs the recognition result of the utterance. The communication control unit 112 stores the recognition result in the communication history 123 and stores the spoken voice data in the storage device 120 (S102).

The communication control unit 112 broadcasts user A's spoken voice data to each user terminal 500 other than that of user A, who spoke. In addition, for display synchronization, it transmits user A's utterance (text) stored in the communication history 123 to each user terminal 500 in the communication group, including user A's own terminal (S103).

The communication App control unit 520 of each user terminal 500 other than user A's automatically plays back the received spoken voice data to output the voice (S502b, S502c), and displays the text of the utterance corresponding to that voice in the display field D.
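Steps S101 to S103 can be sketched as a single handler on the management device side. The class name, the outbox dictionaries, and the injected `recognize` callable are assumptions made for illustration; a real system would call a speech recognition engine and send data over the network instead of appending to lists.

```python
class ManagementDevice:
    """Hypothetical stand-in for the user-utterance path of the management device."""

    def __init__(self, group_members, recognize):
        self.group_members = list(group_members)
        self.recognize = recognize      # stand-in for the voice recognition unit
        self.history = []               # (speaker, text) pairs in arrival order
        self.audio_outbox = {m: [] for m in self.group_members}
        self.text_outbox = {m: [] for m in self.group_members}

    def on_user_speech(self, speaker, audio):
        text = self.recognize(audio)                 # S101: speech recognition
        self.history.append((speaker, text))         # S102: store in the history
        for member in self.group_members:            # S103: distribution
            if member != speaker:
                # Audio goes to everyone except the speaker.
                self.audio_outbox[member].append(audio)
            # Text goes to everyone, including the speaker, for display sync.
            self.text_outbox[member].append((speaker, text))
        return text


# A fake recognizer that simply decodes bytes, used only for this example.
device = ManagementDevice(["A", "B", "C"], recognize=lambda audio: audio.decode())
device.on_user_speech("A", b"Checking the bath now")
```

The asymmetry is deliberate: the speaker does not hear their own voice played back, but their utterance still appears in their own display field.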

Next, the agent device 300 monitors the detection information output from the state detection device. When the detection information is determined to satisfy a judgment condition, the utterance text transmission unit 340 generates agent utterance text based on the determination result and transmits it to the management device 100 (S301).

The agent utterance text may or may not include detection information such as the sensor value. It only needs to report that the judgment condition is satisfied, so it may be utterance text that does not contain the sensor value itself, such as "The temperature is dropping" or "The temperature is too high." The agent utterance text can also be generated to include the sensor value, as in "The temperature has fallen below 36 degrees. The current temperature is 35.1 degrees." Including the measured value lets users know whether an urgent response is required or whether there is still time to respond.
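Both variants of the utterance text can be produced by one small helper, sketched below. The function name, its parameters, and the one-decimal formatting are assumptions; only the example sentences come from the description above.

```python
def build_utterance(base_text, value=None, unit="degrees"):
    """Return the registered text, optionally followed by the measured value."""
    if value is None:
        # Variant without the sensor value: report only that the condition holds.
        return base_text
    # Variant with the sensor value, formatted to one decimal place.
    return f"{base_text} The current temperature is {value:.1f} {unit}."


without_value = build_utterance("The temperature is dropping.")
with_value = build_utterance("The temperature has fallen below 36 degrees.", 35.1)
```

Here `with_value` reads "The temperature has fallen below 36 degrees. The current temperature is 35.1 degrees.", matching the second example in the text.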

The communication control unit 112 of the management device 100 stores the received agent utterance text in the communication history 123 (S104). The voice synthesis unit 114 generates synthesized voice corresponding to the agent utterance text (S105) and stores the generated synthesized voice data in the storage device 120.

コミュニケーション制御部１１２は、コミュニケーショングループに登録されたすべてのユーザ端末５００それぞれにエージェント装置３００の発話音声データを同報送信する。また、コミュニケーション履歴１２３に記憶したエージェント発話テキストを、表示同期のために、コミュニケーショングループ内の各ユーザ端末５００に送信する（Ｓ１０６）。 The communication control unit 112 broadcasts the spoken voice data of the agent device 300 to each of all user terminals 500 registered in the communication group. It also transmits the agent spoken text stored in the communication history 123 to each user terminal 500 in the communication group for display synchronization (S106).

各ユーザ端末５００のコミュニケーションＡｐｐ制御部５２０は、受信したエージェントの発話音声データの自動再生処理を行い、発話音声出力を行いつつ（Ｓ５０３ａ，Ｓ５０３ｂ，Ｓ５０３ｃ）、発話音声に対応するテキスト形式のエージェント発話内容を表示欄Ｄに表示させる。 The communication app control unit 520 of each user terminal 500 performs automatic playback processing of the received agent's speech voice data, and while outputting the speech voice (S503a, S503b, S503c), displays the agent's speech content in text format corresponding to the speech voice in the display field D.

図７は、本実施形態のコミュニケーションシステムが適用された第１事例に基づく処理フローを示す図である。 Figure 7 shows the processing flow based on the first example in which the communication system of this embodiment is applied.

図７に示すように、エージェント装置３００のセンサ情報取得部３２０は、任意のタイミング又は所定の時間間隔で、状態検出機器（センサ機器１）が出力する温泉の温度情報を取得する（Ｓ３００１）。制御部３３０は、温泉情報が取得される度に、温泉の温度が設定管理情報として登録された判定条件を満たすか否かの判定処理を行う（Ｓ３００２）。 As shown in FIG. 7, the sensor information acquisition unit 320 of the agent device 300 acquires hot spring temperature information output by the status detection device (sensor device 1) at any timing or at a predetermined time interval (S3001). Each time hot spring information is acquired, the control unit 330 performs a process of determining whether the hot spring temperature satisfies the judgment condition registered as the setting management information (S3002).

温泉の温度が、判定条件を満たす温度である場合（Ｓ３００３のＹＥＳ）、発話テキスト送信部３４０は、設定管理情報に設定されている発話テキストを抽出し、エージェント発話テキストデータ「温度が３６度を下回りました」を生成する（Ｓ３００４）。発話テキスト送信部３４０は、生成したエージェント発話テキストを管理装置１００に送信する（Ｓ３００５）。 If the temperature of the hot spring meets the judgment condition (YES in S3003), the speech text sending unit 340 extracts the speech text set in the setting management information and generates agent speech text data "The temperature has fallen below 36 degrees" (S3004). The speech text sending unit 340 sends the generated agent speech text to the management device 100 (S3005).

管理装置１００の音声合成部１１４は、受信したエージェント発話テキストの合成音声データを生成する（Ｓ１００１）。また、管理装置１００のコミュニケーション制御部１１２は、エージェント装置３００から受信したエージェント発話テキストを、ユーザ同士のコミュニケーション履歴１２３に含ませて時系列に記憶する（Ｓ１００２）。 The speech synthesis unit 114 of the management device 100 generates synthetic speech data of the received agent utterance text (S1001). In addition, the communication control unit 112 of the management device 100 stores the agent utterance text received from the agent device 300 in chronological order by including it in the communication history 123 between users (S1002).

コミュニケーション制御部１１２は、表示同期のため、テキスト形式のエージェント発話テキストを、ユーザ端末５００に送信すると共に（Ｓ１００３）、エージェント発話内容の合成音声データを複数の各ユーザ端末５００に同報配信する（Ｓ１００４）。 To synchronize the display, the communication control unit 112 transmits the agent utterance text in text format to the user terminal 500 (S1003), and simultaneously distributes the synthesized voice data of the agent utterance content to each of the multiple user terminals 500 (S1004).

各ユーザ端末５００のコミュニケーションＡｐｐ制御部５２０は、テキスト形式のエージェント発話内容を表示欄Ｄに表示させ、かつ合成音声データの自動再生処理を行い、音声出力を行う。このとき、各ユーザ端末５００の表示欄Ｄにおいて、同じエージェント発話内容が同期して表示され、かつエージェント発話内容「温度が３６度を下回りました」の音声出力がそれぞれ行われる。 The communication app control unit 520 of each user terminal 500 displays the agent's utterance content in text format in the display field D, and performs automatic playback processing of the synthetic voice data to output the voice. At this time, the same agent's utterance content is displayed synchronously in the display field D of each user terminal 500, and the agent's utterance content "The temperature has fallen below 36 degrees" is output as voice.

続いて、エージェント発話内容を聞いたユーザＣが、「ちょっと手が離せません」と発話すると、コミュニケーションＡｐｐ制御部５２０は、発話音声を集音し、発話音声データを管理装置１００に送信する。管理装置１００の音声認識部１１３は、受信した発話音声データを音声認識処理し（１００５）、発話内容の音声認識結果を出力する。コミュニケーション制御部１１２は、音声認識結果をコミュニケーション履歴１２３に記憶し、発話音声データを記憶装置１２０に記憶する（Ｓ１００６）。 Next, when User C, having heard the agent's speech, says, "I'm busy right now," the communication app control unit 520 collects the speech and transmits the speech data to the management device 100. The speech recognition unit 113 of the management device 100 performs speech recognition processing on the received speech data (1005) and outputs the speech recognition result of the speech content. The communication control unit 112 stores the speech recognition result in the communication history 123, and stores the speech data in the storage device 120 (S1006).

コミュニケーション制御部１１２は、発話したユーザＣ以外の他のユーザ端末５００それぞれにユーザＣの発話音声データを同報送信する（１００８）。また、コミュニケーション履歴１２３に記憶したユーザＣの発話内容「ちょっと手が離せません」を、表示同期のために、ユーザＣ自身を含むコミュニケーショングループ内の各ユーザ端末５００に送信する（Ｓ１００７）。 The communication control unit 112 broadcasts the speech data of user C to each of the other user terminals 500 other than the user C who made the utterance (1008). In addition, the content of user C's utterance, "I'm busy right now," stored in the communication history 123, is transmitted to each user terminal 500 in the communication group including user C himself for display synchronization (S1007).

各ユーザ端末５００のコミュニケーションＡｐｐ制御部５２０は、受信した発話音声データの自動再生処理を行い、「ちょっと手が離せません」の発話音声出力を行い、音声出力された発話音声に対応するテキスト形式の発話内容「ちょっと手が離せません」を表示欄Ｄに表示させる。なお、発話したユーザＣのユーザ端末５００には、自分が発話した発話音声データが送信されないように管理装置１００側で制御される。 The communication app control unit 520 of each user terminal 500 performs automatic playback processing of the received speech data, outputs the speech "I'm busy right now," and displays the corresponding text-format speech content "I'm busy right now" in display field D. Note that the management device 100 ensures that user C's own speech data is not sent back to the user terminal 500 of user C, who made the utterance.

ユーザＣの発言を聞いたユーザＢは、「隣に居るので私が対応します」と発話すると、コミュニケーションＡｐｐ制御部５２０は、発話音声を集音し、発話音声データを管理装置１００に送信する。管理装置１００の音声認識部１１３は、受信した発話音声データを音声認識処理し（１００９）、発話内容の音声認識結果を出力する。コミュニケーション制御部１１２は、音声認識結果をコミュニケーション履歴１２３に記憶し、発話音声データを記憶装置１２０に記憶する（Ｓ１０１０）。 When User B, who has heard User C's statement, says, "I'm right next to you, so I'll handle it," the communication app control unit 520 collects the spoken voice and transmits the spoken voice data to the management device 100. The voice recognition unit 113 of the management device 100 performs voice recognition processing on the received spoken voice data (1009) and outputs the voice recognition result of the spoken content. The communication control unit 112 stores the voice recognition result in the communication history 123, and stores the spoken voice data in the storage device 120 (S1010).

コミュニケーション制御部１１２は、発話したユーザＢ以外の他のユーザ端末５００それぞれにユーザＢの発話音声データを同報送信する（１０１２）。また、コミュニケーション履歴１２３に記憶したユーザＢの発話内容「隣に居るので私が対応します」を、表示同期のために、ユーザＢ自身を含むコミュニケーショングループ内の各ユーザ端末５００に送信する（Ｓ１０１１）。 The communication control unit 112 broadcasts the speech data of user B to each of the other user terminals 500 other than user B who made the utterance (1012). In addition, the communication control unit 112 transmits the speech content of user B stored in the communication history 123, "I'm right next to you, so I'll handle it," to each user terminal 500 in the communication group including user B himself, for display synchronization (S1011).

各ユーザ端末５００のコミュニケーションＡｐｐ制御部５２０は、受信した発話音声データの自動再生処理を行い、「隣に居るので私が対応します」の発話音声出力を行い、音声出力された発話音声に対応するテキスト形式の発話内容「隣に居るので私が対応します」を表示欄Ｄに表示させる。このときも、発話したユーザＢのユーザ端末５００には、自分が発話した発話音声データが送信されないように管理装置１００側で制御される。 The communication app control unit 520 of each user terminal 500 performs automatic playback processing of the received speech data, outputs the speech "I'm right next to you, so I'll handle it," and displays the corresponding text-format speech content "I'm right next to you, so I'll handle it" in display field D. Here too, the management device 100 ensures that user B's own speech data is not sent back to the user terminal 500 of user B, who made the utterance.
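
The distribution rule used throughout the first example (text to every terminal including the speaker's, audio to every terminal except the speaker's, and agent audio to all terminals) can be sketched as below. The class and method names are hypothetical, and speech recognition and speech synthesis are assumed to have already produced the `text` and `audio` arguments.

```python
from dataclasses import dataclass, field


@dataclass
class Terminal:
    """Hypothetical model of one user terminal 500."""
    user: str
    audio_inbox: list = field(default_factory=list)  # audio queued for auto playback
    display: list = field(default_factory=list)      # contents of display field D


class ManagementDevice:
    """Hypothetical model of the distribution control in management device 100."""

    def __init__(self, terminals):
        self.terminals = terminals
        self.history = []  # communication history, in chronological order

    def handle_utterance(self, speaker, text, audio, from_agent=False):
        self.history.append((speaker, text))
        for terminal in self.terminals:
            # Text goes to every terminal, including the speaker's, so that
            # all displays stay synchronized with the history.
            terminal.display.append((speaker, text))
            # Audio is not echoed back to a human speaker's own terminal;
            # agent audio is delivered to everyone.
            if from_agent or terminal.user != speaker:
                terminal.audio_inbox.append(audio)
```

After an agent utterance followed by an utterance from user C, every terminal shows the same two history entries, while user C's terminal holds only the agent's audio.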

（第２実施形態） Second Embodiment
図８から図１１は、第２実施形態を説明するための図である。図８は、本実施形態に係るコミュニケーションシステムのネットワーク構成図である。本実施形態のコミュニケーションシステムは、上記第１実施形態に対し、ユーザ端末５００において発話されたユーザからの問い掛けに応じてエージェント機能を提供する態様である。なお、上記第１実施形態と同様の構成については、同符号を付して説明を省略する。 Figures 8 to 11 are diagrams for explaining the second embodiment. Figure 8 is a network configuration diagram of a communication system according to this embodiment. In contrast to the first embodiment, the communication system of this embodiment provides an agent function in response to a question spoken by a user at a user terminal 500. Note that the same components as those in the first embodiment are denoted by the same reference numerals and their description is omitted.

図９は、本実施形態のコミュニケーション管理装置１００、エージェント装置３００、ユーザ端末５００の各構成ブロック図である。第１実施形態の図２に対し、エージェント装置３００の構成が一部変更・追加され、ユーザ端末５００でのユーザの発話をトリガーに、エージェント装置３００が、検出情報に基づくエージェント発話テキストを生成して管理装置１００に送信する。 Figure 9 is a block diagram of the communication management device 100, agent device 300, and user terminal 500 of this embodiment. Compared to Figure 2 of the first embodiment, the configuration of the agent device 300 has been partially changed and added, and when triggered by a user's utterance on the user terminal 500, the agent device 300 generates agent utterance text based on the detection information and transmits it to the management device 100.

具体的には、管理装置１００のコミュニケーション制御部１１２は、ユーザ端末５００から受信した発話音声の音声認識結果をエージェント装置３００に送信する機能を備える。エージェント装置３００は、ユーザの発話音声の音声認識結果を受信するテキスト受信部３７０と、テキスト形式の音声認識結果を解析するテキスト解析部３８０と、テキスト解析部３８０の解析結果に基づいて、エージェント発話テキストを提供するか否かを判定する制御部（情報提供部）３３０Ａと、を備える。発話テキスト送信部３４０は、制御部３３０Ａの判定結果に基づいて、エージェント発話テキストを生成し、管理装置１００に送信する。 Specifically, the communication control unit 112 of the management device 100 has a function of transmitting the voice recognition result of the spoken voice received from the user terminal 500 to the agent device 300. The agent device 300 has a text receiving unit 370 that receives the voice recognition result of the user's spoken voice, a text analysis unit 380 that analyzes the voice recognition result in text format, and a control unit (information providing unit) 330A that determines whether to provide agent utterance text based on the analysis result of the text analysis unit 380. The utterance text sending unit 340 generates agent utterance text based on the determination result of the control unit 330A and sends it to the management device 100.

図１０は、本実施形態のコミュニケーションシステムの第２事例に基づく処理フローを示す図である。 Figure 10 shows the processing flow based on the second example of the communication system of this embodiment.

図１０に示すように、ユーザＣが、「今のＢ温泉の温度を教えて」と発話すると、コミュニケーションＡｐｐ制御部５２０は、発話音声を集音し、発話音声データを管理装置１００に送信する。管理装置１００の音声認識部１１３は、受信した発話音声データを音声認識処理し（１００５）、発話内容の音声認識結果を出力する。コミュニケーション制御部１１２は、音声認識結果をコミュニケーション履歴１２３に記憶し、発話音声データを記憶装置１２０に記憶する（Ｓ１００６）。 As shown in FIG. 10, when user C says, "Tell me the current temperature of hot spring B," the communication app control unit 520 collects the spoken voice and transmits the spoken voice data to the management device 100. The voice recognition unit 113 of the management device 100 performs voice recognition processing on the received spoken voice data (1005) and outputs the voice recognition result of the spoken content. The communication control unit 112 stores the voice recognition result in the communication history 123, and stores the spoken voice data in the storage device 120 (S1006).

コミュニケーション制御部１１２は、発話したユーザＣ以外の他のユーザ端末５００それぞれにユーザＣの発話音声データを同報送信する（１００８）。一方、コミュニケーション履歴１２３に記憶したユーザＣの発話内容「今のＢ温泉の温度を教えて」を、表示同期のために、ユーザＣ自身を含むコミュニケーショングループ内の各ユーザ端末５００に送信するとともに、エージェント装置３００にもテキスト形式の発話内容「今のＢ温泉の温度を教えて」を送信する（Ｓ１００７Ａ）。 The communication control unit 112 broadcasts the speech data of user C to each of the other user terminals 500 other than the user C who made the utterance (1008). Meanwhile, the speech content of user C stored in the communication history 123, "Tell me the current temperature of hot spring B," is transmitted to each user terminal 500 in the communication group including user C himself for display synchronization, and the speech content in text format, "Tell me the current temperature of hot spring B," is also transmitted to the agent device 300 (S1007A).

エージェント装置３００は、テキスト受信部３７０を介して「今のＢ温泉の温度を教えて」の発話テキストを受信する。受信した発話テキストは、テキスト解析部３８０によって解析され、例えば、周知の形態素解析を行って、キーワードを抽出する（Ｓ３１０１）。例えば、「Ｂ温泉」、「温度」、「教えて」の各キーワードを抽出する。 The agent device 300 receives the spoken text "Tell me the current temperature of hot spring B" via the text receiving unit 370. The received spoken text is analyzed by the text analysis unit 380, for example, by performing a well-known morphological analysis to extract keywords (S3101). For example, the keywords "Hot spring B," "temperature," and "tell me" are extracted.

エージェント装置３００の制御部（情報提供部）３３０Ａは、テキスト解析部３８０の解析結果であるキーワードを用いて、情報提供判定処理を行う（３１０２）。例えば、設定管理情報として、エージェント装置３００の管理対象の名称（Ｂ温泉）、エージェント装置３００に接続される状態検出機器によって検出される検出属性（温度）、質問文例示情報（「教えて」、「は？」、「いくつ」、「知りたい」）などを登録しておく。なお、本実施形態においても、これらの設定管理情報の登録処理は、設定管理部３５０を通じて行われる。 The control unit (information provision unit) 330A of the agent device 300 performs information provision determination processing using the keywords obtained as the analysis result of the text analysis unit 380 (3102). For example, the following are registered as setting management information: the name of the object managed by the agent device 300 (hot spring B), the detection attribute (temperature) detected by the status detection device connected to the agent device 300, and example question phrases ("tell me," "what is ...?", "how much," "I want to know"). Note that in this embodiment as well, the registration of this setting management information is performed through the setting management unit 350.

制御部（情報提供部）３３０Ａは、ユーザＣの音声認識結果に、状態検出機器又は検出情報に対する問い掛けに関するキーワードが含まれているか否かを判定し、含まれていると判定された場合に（Ｓ３１０３のＹＥＳ）、センサ情報取得部３２０を通じて検出情報を取得する（３００１）。上記例示の場合、ユーザＣの音声認識結果に、「Ｂ温泉」が含まれており、かつ検出属性の「温度」と、質問文「教えて」が入っているので、制御部３３０Ａは、情報提供判定結果として「可」を出力する。 The control unit (information provision unit) 330A determines whether the voice recognition result of user C includes keywords related to the state detection device or a question about the detection information, and if it is determined that the keywords are included (YES in S3103), it acquires the detection information through the sensor information acquisition unit 320 (3001). In the above example, the voice recognition result of user C includes "B Hot Spring", and also includes the detection attribute "temperature" and the question "tell me", so the control unit 330A outputs "OK" as the information provision judgment result.

なお、上記説明では、複数のエージェント装置３００がコミュニケーショングループに登録されていることを想定し、各エージェント装置３００が自身に対する問い掛けかを判断するために、エージェント装置３００の管理対象の名称が含まれているかを判定要素として取り入れている。しかしながら、例えば、コミュニケーショングループに１つのエージェント装置３００しか登場しない場合は、「温度教えて」などのユーザの発話で、状態検出機器から検出情報を取得するように構成することができる。また、状態検出機器の名称（温度センサ）などを情報提供判定情報として登録し、ユーザＣが、「温度センサの値は？」というような問い掛けに対して、エージェント装置３００が、検出情報に基づく発話を行うように構成することができる。 In the above explanation, it is assumed that multiple agent devices 300 are registered in a communication group, and in order for each agent device 300 to determine whether a question is directed at itself, the inclusion of the name of the object managed by the agent device 300 is used as a determination factor. However, for example, if only one agent device 300 appears in a communication group, it is possible to configure the system so that detection information is obtained from the status detection device in response to a user utterance such as "Tell me the temperature." In addition, the name of the status detection device (temperature sensor) etc. can be registered as information provision determination information, and the agent device 300 can be configured to make a statement based on the detection information in response to a question from user C such as "What's the temperature sensor reading?".
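
The information provision determination described above can be sketched as follows. This is a simplified illustration, not the embodiment's implementation: it replaces morphological analysis with plain case-insensitive substring matching, uses English keywords, and all names (`ProvisionSettings`, `should_provide`, `require_target_name`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvisionSettings:
    """Hypothetical stand-in for the registered setting management information."""
    target_name: str                  # name of the managed object, e.g. "hot spring B"
    attribute: str                    # detection attribute, e.g. "temperature"
    question_phrases: tuple           # example question phrases
    require_target_name: bool = True  # may be False when only one agent is in the group


def should_provide(recognized_text: str, settings: ProvisionSettings) -> bool:
    """Decide whether the recognized utterance is a question addressed to this agent."""
    text = recognized_text.casefold()
    has_name = settings.target_name.casefold() in text
    has_attribute = settings.attribute.casefold() in text
    has_question = any(p.casefold() in text for p in settings.question_phrases)
    if settings.require_target_name and not has_name:
        return False  # the question is aimed at another agent, or at no agent
    return has_attribute and has_question
```

Setting `require_target_name=False` corresponds to the single-agent case, in which an utterance such as "Tell me the temperature" is enough to trigger a response.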

エージェント装置３００のセンサ情報取得部３２０は、制御部３３０Ａの判定結果が「可」である場合に、状態検出機器（センサ機器１）が出力する温泉の温度情報を取得する（Ｓ３００１）。発話テキスト送信部３４０は、設定管理情報に設定されている発話テキストを抽出し、エージェント発話テキストデータ「現在の温度は３７．５度です」を生成する（Ｓ３００４）。発話テキスト送信部３４０は、生成したエージェント発話テキストを管理装置１００に送信する（Ｓ３００５）。このとき、定型文「現在の温度は○○度です」を設定管理情報として登録しておき、「○○」の部分を検出情報「３７．５」に置き換えて、エージェント発話テキストを生成することができる。 When the determination result of the control unit 330A is "OK," the sensor information acquisition unit 320 of the agent device 300 acquires the hot spring temperature information output by the state detection device (sensor device 1) (S3001). The speech text transmission unit 340 extracts the speech text set in the setting management information, and generates agent speech text data "The current temperature is 37.5 degrees" (S3004). The speech text transmission unit 340 transmits the generated agent speech text to the management device 100 (S3005). At this time, the standard phrase "The current temperature is ○○ degrees" can be registered as setting management information, and the "○○" part can be replaced with the detection information "37.5" to generate the agent speech text.
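
The placeholder substitution described above is simple enough to show directly. The following is a minimal sketch, assuming the registered standard phrase marks the slot with "○○" as in the text; the function name and the one-decimal formatting are assumptions.

```python
PLACEHOLDER = "○○"  # slot marker used in the registered standard phrase


def fill_template(template: str, reading: float) -> str:
    """Replace the placeholder in a registered phrase with the measured value."""
    if PLACEHOLDER not in template:
        raise ValueError("template has no placeholder to fill")
    return template.replace(PLACEHOLDER, f"{reading:.1f}")
```

For instance, `fill_template("The current temperature is ○○ degrees", 37.5)` yields "The current temperature is 37.5 degrees".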

各ユーザ端末５００のコミュニケーションＡｐｐ制御部５２０は、テキスト形式のエージェント発話内容を表示欄Ｄに表示させ、かつ合成音声データの自動再生処理を行い、音声出力を行う。このとき、各ユーザ端末５００の表示欄Ｄにおいて、同じエージェント発話内容が同期して表示され、かつエージェント発話内容「現在の温度は○○度です」の音声出力がそれぞれ行われる。 The communication app control unit 520 of each user terminal 500 displays the agent's utterance content in text format in the display field D, and performs automatic playback processing of the synthetic voice data to output the voice. At this time, the same agent's utterance content is displayed synchronously in the display field D of each user terminal 500, and the agent's utterance content "The current temperature is XX degrees" is output as voice.

続いて、エージェント発話内容を音声で聞いたユーザＣが、「基準温度よりも高いですがボイラー入れてください」と発話すると、コミュニケーションＡｐｐ制御部５２０は、発話音声を集音し、発話音声データを管理装置１００に送信する。管理装置１００の音声認識部１１３は、受信した発話音声データを音声認識処理し（１００９）、発話内容の音声認識結果を出力する。コミュニケーション制御部１１２は、音声認識結果をコミュニケーション履歴１２３に記憶し、発話音声データを記憶装置１２０に記憶する（Ｓ１０１０）。 Next, when User C, who has heard the agent's speech, says, "It's higher than the reference temperature, but please turn on the boiler," the communication app control unit 520 collects the speech and transmits the speech data to the management device 100. The speech recognition unit 113 of the management device 100 performs speech recognition processing on the received speech data (1009) and outputs the speech recognition result of the speech content. The communication control unit 112 stores the speech recognition result in the communication history 123, and stores the speech data in the storage device 120 (S1010).

コミュニケーション制御部１１２は、発話したユーザＣ以外の他のユーザ端末５００それぞれにユーザＣの発話音声データを同報送信する（１０１２）。また、コミュニケーション履歴１２３に記憶したユーザＣの発話内容「基準温度よりも高いですがボイラー入れてください」を、表示同期のために、ユーザＣ自身を含むコミュニケーショングループ内の各ユーザ端末５００に送信する（Ｓ１０１２）。 The communication control unit 112 broadcasts the speech data of user C to each of the other user terminals 500 other than the user C who made the utterance (1012). In addition, the content of user C's utterance stored in the communication history 123, "It's higher than the reference temperature, but please turn on the boiler," is transmitted to each user terminal 500 in the communication group including user C himself for display synchronization (S1012).

図１１は、本実施形態のユーザ端末５００に表示される画面例である。図１１に示すように、各ユーザ端末５００は、自分の発話内容及び自分以外の他のユーザの発話内容と共に、エージェント装置３００への問い掛けや呼び掛けに対する発話内容と、問い掛けや呼び掛けをトリガーに発話したエージェント装置３００の発話内容が表示欄Ｄに時系列に表示される。そして、管理装置１００に蓄積されるコミュニケーション履歴１２３がログ情報として共有される。 Figure 11 is an example of a screen displayed on the user terminal 500 of this embodiment. As shown in Figure 11, each user terminal 500 chronologically displays in display field D the contents of the user's own speech and the contents of speech of other users, as well as the contents of speech in response to questions or calls made to the agent device 300, and the contents of speech made by the agent device 300 triggered by the questions or calls. The communication history 123 stored in the management device 100 is then shared as log information.

本実施形態は、エージェント装置３００が、ユーザの問い掛けや呼び掛けを理解して、その都度、状態検出機器の検出情報に基づくエージェント発話テキストを生成して提供するので、エージェント装置３００がコミュニケーショングループ内の疑似ユーザとして登場し、ユーザ同士の情報伝達の会話により近いコミュニケーション環境を提供することができる。 In this embodiment, the agent device 300 understands the user's questions and calls, and generates and provides agent utterance text based on the detection information of the status detection device each time. Therefore, the agent device 300 appears as a pseudo user in the communication group, and a communication environment that is closer to a conversation in which users communicate information between each other can be provided.

また、上記例示の施設以外にも、警備業におけるビル施設、物流業のバース（発着所）などがある。また、状態検出機器は、温度センサ以外にも、本コミュニケーションシステムの適用シーンに合わせて、様々な検出機器を用いることができる。 In addition to the facilities listed above, other examples include building facilities in the security industry and berths (departure and arrival points) in the logistics industry. In addition to temperature sensors, various other detection devices can be used as condition detection devices depending on the application scenario of this communication system.

例えば、状態検出機器として、カメラがある。人の動きや混雑度を、カメラで撮影した画像を用いて解析・判定し、「浴場に多数移動した」、「フロントに行列ができている」などの解析結果をトリガーに、エージェント装置３００がこれらの解析結果に対するエージェント発話テキストを管理装置１００に送信し、ユーザ端末５００に合成音声通知及びテキスト表示通知を行うことができる。また、混雑等に関する他の例としては、例えば、駐車場の混雑度を解析・判定し、「もうすぐ駐車場が満車になります」、「第２駐車場の準備をお願いします」などをユーザ端末５００に合成音声通知及びテキスト表示通知を行うこともできる。 For example, a camera is an example of a status detection device. Images captured by the camera are used to analyze and determine people's movements and the degree of congestion. Analysis results such as "a large number of people have moved to the bath area" or "there is a line at the front desk" are used as triggers for the agent device 300 to send agent utterance text in response to these analysis results to the management device 100, and a synthetic voice notification and a text display notification can be made to the user terminal 500. As another example of congestion, for example, the degree of congestion in a parking lot can be analyzed and determined, and a synthetic voice notification and a text display notification can be made to the user terminal 500, such as "the parking lot will soon be full" or "please prepare the second parking lot."

また、エージェント装置３００が、特定の人物をカメラ画像から抽出する機能を備えるように構成することもできる。この場合、例えば、予め登録された人物画像と撮影された画像とのマッチング処理を行い、状態検出機器であるカメラが設置された場所の情報を用いて、「誰が何処に到着した」という解析結果を得ることができる。このような解析結果をトリガーとして使用し、例えば、「～さんが、～にいます」というエージェント発話テキストをエージェント装置３００が出力し、管理装置１００を通じて各ユーザ端末５００に合成音声で通知することができる。 The agent device 300 can also be configured to have a function for extracting a specific person from a camera image. In this case, for example, a matching process can be performed between a preregistered person image and a captured image, and an analysis result such as "who arrived where" can be obtained using information on the location where the camera, which is a status detection device, is installed. Such an analysis result can be used as a trigger to cause the agent device 300 to output agent utterance text such as "Mr./Ms. ~ is at ~", and a notification can be sent to each user terminal 500 in a synthetic voice via the management device 100.

また、他の例としては、状態検出機器として重量センサを適用することができる。例えば、エレベーター等に使われている重量センサと連携し、１０分間に５回以上重量オーバーの発生を検知したことをトリガーに、エージェント装置３００が「エレベーターが混雑しています」などのエージェント発話テキストを出力し、管理装置１００を通じて合成音声で各ユーザ端末５００（各ユーザ）に通知する。各ユーザは、必要に応じて人通り整理に向かうことができる。 As another example, a weight sensor can be used as a status detection device. For example, by linking with a weight sensor used in elevators, etc., and detecting five or more occurrences of weight exceeding the limit within a ten-minute period, the agent device 300 will output an agent speech text such as "The elevator is crowded," and notify each user terminal 500 (each user) by synthetic voice via the management device 100. Each user can then head out to direct pedestrian traffic as necessary.
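
The trigger "five or more overload detections within ten minutes" is a sliding-window count, which can be sketched as below. The class name, the use of timestamps in seconds, and the choice to keep an event that is exactly ten minutes old inside the window are assumptions for illustration.

```python
from collections import deque


class OverloadMonitor:
    """Hypothetical sliding-window counter for weight-overload events."""

    def __init__(self, window_s: float = 600.0, min_events: int = 5) -> None:
        self.window_s = window_s
        self.min_events = min_events
        self._events = deque()  # timestamps (seconds) of recent overload events

    def record_overload(self, timestamp_s: float) -> bool:
        """Record one overload event; return True if the trigger condition holds."""
        self._events.append(timestamp_s)
        # Drop events that have fallen out of the window.
        while timestamp_s - self._events[0] > self.window_s:
            self._events.popleft()
        return len(self._events) >= self.min_events
```

In this sketch the method keeps returning True for every further overload while the window still holds five or more events, so a caller that wants a single notification per congestion episode would need to add its own suppression logic.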

さらに、状態検出機器としてＧＰＳ装置（位置情報検出機器）を適用することができる。例えば、人力で引く荷車などにＧＰＳ装置を取り付けておき、エージェント装置３００は、ＧＰＳ装置から荷車の位置情報を取得可能に構成する。そして、エージェント装置３００は、予め設定されたルートや進入禁止場所と荷車の現在位置とをマッチングし、ルートから所定の範囲ズレていることや進入禁止場所への侵入を検知することができる。そして、これらを検知した場合に、「ルートは間違っていませんか？」、「そこは進入禁止場所です」といったエージェント発話テキストを出力し、管理装置１００を通じて合成音声で各ユーザ端末５００（各ユーザ）に通知する。このとき、進入禁止場所への侵入は、ユーザ端末５００のユーザ以外にも施設利用者も想定される。この場合、通知を受けた各ユーザ端末５００のユーザは、進入禁止場所へ向かい、施設利用者に適切にガイドすることができる。 Furthermore, a GPS device (position information detection device) can be applied as the state detection device. For example, a GPS device is attached to a cart pulled by human power, and the agent device 300 is configured to be able to obtain the position information of the cart from the GPS device. The agent device 300 can match the current position of the cart with a preset route or a prohibited entry place, and detect a predetermined range of deviation from the route or an intrusion into a prohibited entry place. When the agent device 300 detects these, it outputs an agent speech text such as "Is the route wrong?" or "That is a prohibited entry place," and notifies each user terminal 500 (each user) by synthetic voice through the management device 100. At this time, it is assumed that not only the user of the user terminal 500 but also facility users will intrude into the prohibited entry place. In this case, the user of each user terminal 500 who receives the notification can head to the prohibited entry place and appropriately guide the facility users.
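
The matching of the cart position against a preset route and prohibited areas can be sketched as follows. To keep the example short it assumes positions have already been projected to planar coordinates in meters, models the route as a polyline and each prohibited area as a circle `(x, y, radius)`; these modelling choices and all function names are assumptions, not part of the embodiment.

```python
import math

ROUTE_MESSAGE = "Is the route wrong?"
NO_ENTRY_MESSAGE = "That is a prohibited entry place"


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b (planar, meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def check_position(p, route, max_deviation_m, prohibited_areas):
    """Return the agent utterance text for position p, or None if all is well."""
    for cx, cy, radius in prohibited_areas:
        if math.hypot(p[0] - cx, p[1] - cy) <= radius:
            return NO_ENTRY_MESSAGE
    deviation = min(distance_to_segment(p, a, b) for a, b in zip(route, route[1:]))
    if deviation > max_deviation_m:
        return ROUTE_MESSAGE
    return None
```

Entry into a prohibited area is checked first, so that it takes priority over a route deviation when both apply.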

なお、コミュニケーション管理装置１００が、エージェント装置３００の各機能を備えるように構成することもできる。すなわち、図２又は図９で示したエージェント装置３００の機能が、コミュニケーション管理装置１００内にエージェント部として設けられる。そして、状態検出機器による検出情報が、コミュニケーション管理装置１００に送信されるように構成する。このとき、状態検出機器は、データ通信機能を内的に備えていてもよく、また、状態検出機器を個別のデータ通信機器に接続し、データ通信機器を介してコミュニケーション管理装置１００の検出情報を送信できるように構成してもよい。コミュニケーション管理装置１００のエージェント部は、監視対象の状態検出機器から出力される検出情報を受け付け、検出情報に基づくエージェント発話テキストを生成し、上記実施形態同様に、コミュニケーショングループの一員として動作することができる。 The communication management device 100 can also be configured to have the functions of the agent device 300. That is, the functions of the agent device 300 shown in FIG. 2 or FIG. 9 are provided as an agent unit within the communication management device 100. The system is then configured so that the detection information from the status detection device is transmitted to the communication management device 100. In this case, the status detection device may have a built-in data communication function, or the status detection device may be connected to a separate data communication device so that the detection information can be transmitted to the communication management device 100 via that data communication device. The agent unit of the communication management device 100 can receive detection information output from the monitored status detection device, generate agent utterance text based on the detection information, and operate as a member of a communication group, as in the above embodiments.

（第３実施形態） Third Embodiment
図１２から図１５は、第３実施形態を説明するための図である。なお、上記第１，第２実施形態と同様の構成については、同符号を付して説明を省略する。 Figures 12 to 15 are diagrams for explaining the third embodiment. Note that the same components as those in the first and second embodiments are given the same reference numerals and their description is omitted.

本実施形態のコミュニケーション管理装置１００は、上述のグループ通話機能に加えて、個別通話機能を備えている。図１２は、本実施形態のグループ通話モード中の個別通話モード割り込み処理の一例を説明するための図である。図１２に示すように、エージェント装置３００から発信されるエージェント発話テキストに基づく合成音声を、グループ通話中のコミュニケーショングループ内の特定のユーザだけに向けて送信する。 The communication management device 100 of this embodiment has an individual call function in addition to the group call function described above. FIG. 12 is a diagram for explaining an example of an individual call mode interrupt process during group call mode in this embodiment. As shown in FIG. 12, a synthetic voice based on the agent utterance text sent from the agent device 300 is transmitted only to a specific user in the communication group during the group call.

上述のように、エージェント装置３００は、コミュニケーショングループ内のメンバ（エージェント）として登録される。本実施形態では、管理装置１００を通じたエージェントと特定のユーザとの間の個別通話機能を提供する。 As described above, the agent device 300 is registered as a member (agent) in a communication group. In this embodiment, an individual call function is provided between the agent and a specific user through the management device 100.

図１３は、本実施形態の管理装置（コミュニケーション管理装置）１００、エージェント装置３００、ユーザ端末５００の各構成ブロック図である。図１３に示すように、上記第１実施形態及び第２実施形態において説明した第１制御部及び第２制御部は、グループ通話制御部１１２Ａとして示されている。コミュニケーション制御部１１２は、グループ通話制御部１１２Ａ及び個別通話制御部１１２Ｂを備えるように構成される。 Figure 13 is a block diagram of the management device (communication management device) 100, agent device 300, and user terminal 500 of this embodiment. As shown in Figure 13, the first control unit and second control unit described in the first and second embodiments are shown as a group call control unit 112A. The communication control unit 112 is configured to include a group call control unit 112A and an individual call control unit 112B.

管理装置１００は、コミュニケーショングループに登録された複数のユーザを含むグループメンバリストを生成し、保持している。個別通話制御部１１２Ｂは、エージェント装置３００から送信される個別通話要求に基づいて、グループメンバリストから該当するユーザを指定する。 The management device 100 generates and stores a group member list that includes multiple users registered in a communication group. The individual call control unit 112B selects a relevant user from the group member list based on an individual call request sent from the agent device 300.

個別通話制御部１１２Ｂは、グループ通話で同報配信されるコミュニケーショングループ内のユーザを対象に、特定のユーザだけに向けて発話音声データを送信する個別通話機能を提供する。個別通話制御部１１２Ｂは、グループ通話モード中に、管理装置１００を通じてエージェント装置３００が特定のユーザと一対一で通話を行うために、指定されたユーザに対してコール（呼）を発信するコール処理を行う。コール処理は、維持されているグループ通話モードに対する割り込み処理であり、コール処理に対してユーザが応答すると、呼接続処理（個別通話通信チャネルの確立処理）を行う。これにより、確立された通話チャネルを通じ、エージェントから特定のユーザだけに向けた発話音声データの配信処理が開始される。これらの処理全体は、コミュニケーショングループ内のグループ通話状態を維持しつつ、特定のユーザをコミュニケーショングループ内の他のユーザとは区画した状態で通話を行うための個別通話割り込み処理として実行される。 The individual call control unit 112B provides an individual call function that transmits speech voice data to a specific user only among users in a communication group that is broadcast in a group call. The individual call control unit 112B performs call processing to make a call to a specified user so that the agent device 300 can make a one-to-one call with the specific user through the management device 100 during group call mode. The call processing is an interrupt process for the maintained group call mode, and when the user responds to the call processing, a call connection process (establishment process of an individual call communication channel) is performed. This starts the distribution process of speech voice data from the agent to the specific user only through the established call channel. All of these processes are performed as individual call interruption processing to make a call while maintaining the group call state within the communication group and separating the specific user from other users in the communication group.

なお、本実施形態の個別通話機能は、エージェント以外の２人のユーザ間にも適用可能である。つまり、管理装置１００は、コミュニケーショングループに登録された複数のユーザを含むグループメンバリストを、事前に各ユーザ端末５００に配信することができる。そして、ユーザ端末５００は、グループメンバリストから個別通話相手のユーザが選択されると、選択されたユーザを含む個別通話要求を管理装置１００に送信することができる。個別通話制御部１１２Ｂは、選択されたユーザに対してコール処理を行い、コールされたユーザの応答アクションに基づいて、個別通話通信チャネルの確立することができる。 The individual call function of this embodiment can also be applied between two users other than agents. That is, the management device 100 can distribute a group member list including multiple users registered in a communication group to each user terminal 500 in advance. Then, when a user to be an individual call partner is selected from the group member list, the user terminal 500 can send an individual call request including the selected user to the management device 100. The individual call control unit 112B performs call processing for the selected user and can establish an individual call communication channel based on the response action of the called user.

なお、個別通話制御部１１２Ｂは、グループ通話モード中でなくても、個別通話要求を受け付け、指定又は選択されたユーザとの間で個別通話チャネルを開き、一対一で通話機能を提供することもできる。 In addition, even if the individual call control unit 112B is not in group call mode, it can accept an individual call request, open an individual call channel with a specified or selected user, and provide a one-to-one call function.

個別通話終了後は、コミュニケーショングループ内で維持されているグループ通話モードへの自動復帰処理を行うことができる。自動復帰処理は、コミュニケーション制御部１１２によって遂行される。ユーザ端末５００において個別通話モードに対する切断操作が行われると、コミュニケーション制御部１１２は、確立していた個別通話チャネルの切断処理を行って、実行中のグループ通話モードの通話チャネルに自動復帰させる。また、個別通話制御部１１２Ｂ側からの個別通話通信チャネルの切断処理に伴って、グループ通話モードへの自動復帰を行うように構成してもよい。 After the individual call ends, an automatic return process to the group call mode maintained within the communication group can be performed. The automatic return process is performed by the communication control unit 112. When a disconnection operation for the individual call mode is performed on the user terminal 500, the communication control unit 112 performs a disconnection process of the established individual call channel, and automatically returns to the call channel of the active group call mode. In addition, it may be configured to automatically return to the group call mode in conjunction with the disconnection process of the individual call communication channel from the individual call control unit 112B side.
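
The call, answer, and automatic-return sequence described above can be viewed as a small state machine for the called terminal. The sketch below models only that terminal's mode; the `Mode` values, the `TerminalSession` class, and the rule that a new call is refused while another is ringing or in progress are assumptions for illustration.

```python
from enum import Enum


class Mode(Enum):
    GROUP = "group"            # listening to the group call
    RINGING = "ringing"        # individual call placed, not yet answered
    INDIVIDUAL = "individual"  # one-to-one channel established


class TerminalSession:
    """Hypothetical per-terminal call state kept by the management device."""

    def __init__(self) -> None:
        self.mode = Mode.GROUP
        self.peer = None

    def call(self, caller: str) -> bool:
        """Interrupt group call mode with an individual call from `caller`."""
        if self.mode is not Mode.GROUP:
            return False
        self.mode, self.peer = Mode.RINGING, caller
        return True

    def answer(self) -> bool:
        """The called user responds, so the individual channel is established."""
        if self.mode is not Mode.RINGING:
            return False
        self.mode = Mode.INDIVIDUAL
        return True

    def disconnect(self) -> None:
        """Tear down the individual channel and return to group call mode."""
        self.mode, self.peer = Mode.GROUP, None
```

Because `disconnect` always lands in `Mode.GROUP`, the return to the group call needs no separate user action, which mirrors the automatic return described in the text.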

The timing of a call in the individual call mode (call start time, talk time after the call is answered, and call end time) is stored in the management device 100 as the individual call mode execution history, together with a record of the individual call partner. As in the group call mode, the speech voice data uttered during an individual call can be converted into text by speech recognition processing and stored either within the communication history information 123 or separately, linked to the timeline of the communication history information 123. The speech voice data from the individual call mode can likewise be stored in the storage device 120.
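One possible shape for an entry in the individual call mode execution history is sketched below in Python. The `IndividualCallRecord` class, its field names, and the sample timestamps are assumptions made for illustration; the specification only lists the kinds of values that are stored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class IndividualCallRecord:
    """Hypothetical entry in the individual call mode execution history."""
    caller_id: str
    partner_id: str
    call_started: datetime   # when call processing began
    answered: datetime       # when the called user answered
    ended: datetime          # when the individual channel was disconnected

    @property
    def talk_time(self) -> timedelta:
        # Talk time is counted from the answer, not from the start of the call.
        return self.ended - self.answered


# Sample values only.
record = IndividualCallRecord(
    caller_id="agent",
    partner_id="floor-manager",
    call_started=datetime(2020, 6, 30, 9, 0, 0),
    answered=datetime(2020, 6, 30, 9, 0, 10),
    ended=datetime(2020, 6, 30, 9, 1, 40),
)
print(record.talk_time.total_seconds())  # 90.0
```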

In this way, the management device 100 (communication device 130) of this embodiment supports the group call function by performing broadcast distribution communication control, which simultaneously sends one user's speech voice data and the text of that speech (obtained by speech recognition processing of the speech voice data) to each of the user terminals 500. It also supports the individual call function by performing individual distribution communication control of speech voice data between specific users (individual call users).

Next, the agent device 300 can hold in advance the specific notification setting information shown in FIG. 14. As shown in FIG. 14, situation determination conditions are set, and for each condition a specific user to be contacted by individual call is defined. The content of the contact (the agent utterance text) is also set in advance.

The specific notification setting information shown in FIG. 14 is the setting management information of FIG. 5 in the first and second embodiments, extended with the user to be contacted (specific user, user characteristics) and the line type used as the means of contact (individual call, group call). The determination conditions in FIG. 5 correspond to the situation determination conditions in FIG. 14.

FIG. 15 is a diagram showing the processing flow based on the third case example of the communication system of this embodiment.

The control unit (determination unit) 330 of the agent device 300 receives detection information output from the monitored sensor device (state detection device) 1 (S3001) and matches it against the "situation determination conditions" in the specific notification setting information (S3002). It determines whether the received detection information satisfies a situation determination condition (S3003). If so (YES in S3003), it extracts the preset utterance text (S3004) and sends the management device 100 a contact request containing the user to be contacted, the line type, and the utterance text (S3005).
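Steps S3001 to S3005 can be sketched as a rule lookup. The Python below is a minimal illustration under stated assumptions: the `NotificationRule` and `ContactRequest` classes, the dictionary layout of the detection information, the sensor name `boiler_temp`, and the threshold of 60.0 are all hypothetical and are not taken from FIG. 14.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class NotificationRule:
    """One row of a hypothetical specific notification settings table."""
    condition: Callable[[dict], bool]  # situation determination condition
    contact_user: str                  # specific user or user characteristic
    line_type: str                     # "individual" or "group"
    utterance_text: str                # preset agent utterance text


@dataclass(frozen=True)
class ContactRequest:
    contact_user: str
    line_type: str
    utterance_text: str


# Assumed sample rule; the sensor name and threshold are made up.
RULES: List[NotificationRule] = [
    NotificationRule(
        condition=lambda d: d.get("sensor") == "boiler_temp" and d["value"] < 60.0,
        contact_user="qualified_person",
        line_type="individual",
        utterance_text="Please adjust the temperature immediately.",
    ),
]


def handle_detection(detection: dict, rules: List[NotificationRule]) -> List[ContactRequest]:
    """Match received detection information (S3001) and build contact requests."""
    requests = []
    for rule in rules:                       # S3002: match against each condition
        if rule.condition(detection):        # S3003: condition satisfied?
            requests.append(ContactRequest(  # S3004: take the preset text
                rule.contact_user, rule.line_type, rule.utterance_text))
    return requests                          # S3005: caller sends these onward


print(handle_detection({"sensor": "boiler_temp", "value": 55.0}, RULES))
```

Returning a list rather than a single request leaves room for several rules, and therefore several recipients, to fire on the same detection.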

When the management device 100 receives a contact request from the agent device 300, the voice synthesis unit 114 generates synthesized voice data for the received agent utterance text (S1001).

Next, the communication control unit 112 refers to the line type and the specific user (the contact destination) contained in the received contact request and checks whether an individual call to the specific user is specified (S1001A). If the line type is "group call", the process proceeds to step S1002 and the contact is made in the group call mode rather than the individual call mode (S1003, S1004). At this time, the utterance text and related data are accumulated in chronological order in the communication history 123 (S1002).

In step S1001A, if it is determined that an individual call to the specific user is specified (YES in S1001A), the individual call control unit 112B performs individual call mode (interrupt) processing toward the specific user named in the contact request, interrupting the current group call mode (S1001B). Specifically, it calls the specific user over an individual call communication channel (S1001C). The specific user who receives the call performs an operation to answer the incoming call (S504a). When the specific user answers, the management device 100 performs call processing to establish an individual call line between the management device 100 and the specific user over the individual call communication channel (S1001D). The individual call control unit 112B then delivers the synthesized voice data of the agent utterance text to the specific user's user terminal 500 over the individual call line. Contact between the agent and the specific user over an individual call line is thereby achieved.
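The branch on line type in the management device can be sketched as follows. This Python is a simplified illustration, not the patented implementation: `synthesize` is a placeholder rather than a real text-to-speech call, `answers_call` stands in for the user's answer operation, and the dictionary keys are hypothetical. Delivering nothing when the call is not answered is also an assumption, since the specification does not describe that case.

```python
from typing import Callable, Dict, List


def synthesize(text: str) -> bytes:
    """Placeholder for the voice synthesis unit (S1001); not a real TTS call."""
    return text.encode("utf-8")


def dispatch_contact(
    request: Dict[str, str],
    group_members: List[str],
    answers_call: Callable[[str], bool],
    history: List[dict],
) -> Dict[str, bytes]:
    """Return a mapping of user -> audio delivered for one contact request."""
    audio = synthesize(request["utterance_text"])   # S1001
    delivered: Dict[str, bytes] = {}
    if request["line_type"] == "individual":        # S1001A
        user = request["contact_user"]
        # S1001B/S1001C: interrupt group mode and call the specific user.
        if answers_call(user):                      # S504a
            delivered[user] = audio                 # S1001D, then delivery
    else:
        # Group call: broadcast to every member of the communication group.
        delivered = {member: audio for member in group_members}
    # S1002: accumulate the contact in the communication history.
    history.append({
        "line_type": request["line_type"],
        "text": request["utterance_text"],
        "recipients": sorted(delivered),
    })
    return delivered


history: List[dict] = []
request = {"contact_user": "u2", "line_type": "individual",
           "utterance_text": "Please adjust the temperature immediately."}
print(sorted(dispatch_contact(request, ["u1", "u2", "u3"], lambda u: True, history)))
```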

Note that a specific user who has moved to the individual call mode is treated as being "on hold" with respect to the group call channel, and can automatically return to the group call communication channel after the individual call ends. The communication control unit 112 also stores the history of contact with the specific user made via the individual call mode in the communication history 123 (S1002).

The agent may have two or more individual call partners. In this case, a separate individual call channel can be established for each specific user, and synthesized voice data based on the agent utterance text can be delivered over each channel. A different agent utterance text can also be set for each individual call partner. For instance, as in the example of FIG. 14, the agent utterance text for the floor manager can be set to "The temperature has fallen below the threshold. A notification requesting action will be sent to the designated user," while the text for a qualified person (for example, a boiler engineer) can be set to "Please adjust the temperature immediately." The floor manager and the qualified person thus receive synthesized voice data based on different utterance texts for the same situation determination condition.

The user to be contacted need not be a preset user. As in the example of FIG. 14, the location of each user (user terminal) can be tracked in advance, and one or more users near the place where an event occurred can be chosen as the specific users to respond to the event that arises when a situation determination condition is satisfied. In the example of FIG. 14, when entry into a no-entry area is detected, a specific user is selected according to user location information, and synthesized voice data for the utterance text "A sensor detection has occurred in a no-entry area. Please respond as a nearby user." can be delivered to the selected specific user.
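Selecting nearby users from location information can be sketched as a nearest-neighbour ranking. The Python below assumes planar (x, y) coordinates and straight-line distance; the function name `select_nearby_users`, the coordinate representation, and the sample positions are hypothetical, and the specification does not state how proximity is measured.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float]  # assumed planar (x, y) coordinates


def select_nearby_users(
    event_pos: Position,
    user_positions: Dict[str, Position],
    count: int = 1,
) -> List[str]:
    """Return the `count` users closest to the event location."""
    ranked = sorted(
        user_positions,
        key=lambda uid: math.dist(event_pos, user_positions[uid]),
    )
    return ranked[:count]


# Sample positions only.
positions = {"alice": (0.0, 0.0), "bob": (30.0, 40.0), "carol": (3.0, 4.0)}
print(select_nearby_users((0.0, 0.0), positions, count=2))  # ['alice', 'carol']
```

The selected users would then be named as the specific users in the contact request, in place of a preset contact.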

As described above, the management device 100 can be configured to include the functions of the agent device 300. Accordingly, in a modification of this embodiment, the management device 100 is provided with an agent function unit corresponding to the agent device 300. The management device 100 can then be configured to receive detection information from the sensor device 1, perform steps S3002, S3003, and S3004 of FIG. 15, and make contact in the individual call mode during a group call.

This embodiment has been described above. Each function of the communication management device 100 and the agent device 300 can be realized by a program: a computer program prepared in advance to realize each function is stored in an auxiliary storage device, a control unit such as a CPU reads the program from the auxiliary storage device into a main storage device, and the control unit executes the program read into the main storage device, thereby operating the function of each unit.

The above program can also be provided to a computer in a state in which it is recorded on a computer-readable recording medium. Examples of computer-readable recording media include optical disks such as CD-ROMs, phase-change optical disks such as DVD-ROMs, magneto-optical disks such as MO (magneto-optical) disks and MD (MiniDisc), magnetic disks such as floppy (registered trademark) disks and removable hard disks, and memory cards such as CompactFlash (registered trademark), SmartMedia, SD memory cards, and Memory Stick. Hardware devices such as integrated circuits (IC chips and the like) specially designed and configured for the purposes of the present invention are also included as recording media.

Although an embodiment of the present invention has been described, the embodiment is presented as an example and is not intended to limit the scope of the invention. This novel embodiment can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the gist of the invention. These embodiments and their modifications fall within the scope and gist of the invention, and within the invention described in the claims and its equivalents.

100 Communication management device
110 Control device
111 User management unit
112 Communication control unit (first control unit, second control unit)
112A Group call control unit (first control unit, second control unit)
112B Individual call control unit
113 Voice recognition unit
114 Voice synthesis unit
120 Storage device
121 User information
122 Group information
123 Communication history information
124 Voice recognition dictionary
125 Voice synthesis dictionary
130 Communication device
300 Agent device
310 Communication unit
320 Sensor information acquisition unit
330 Control unit (determination unit)
330A Control unit (information providing unit)
340 Utterance text transmission unit
350 Setting management unit
360 Storage unit
370 Text reception unit
380 Text analysis unit
500 User terminal (mobile communication terminal)
510 Communication/call unit
520 Communication App control unit
530 Microphone (sound collection unit)
540 Speaker (audio output unit)
550 Display/input unit
560 Storage unit
D Display field

Claims

1. A communication system in which a user's speech is broadcast to the mobile communication terminals of other users through a mobile communication terminal carried by each of a plurality of users, the communication system comprising:
a communication management device to which each of the mobile communication terminals is connected by wireless communication; and
an agent device to which detection information output from a state detection device of a monitored object is input and which is connected to the communication management device,
wherein the communication management device includes a communication control unit including a first control unit that broadcasts speech voice data received from a mobile communication terminal to each of a plurality of other mobile communication terminals, and a second control unit that accumulates a speech recognition result obtained by performing speech recognition processing on the received speech voice data in chronological order as a communication history between users, and controls text distribution so that the communication history is displayed synchronously on each of the mobile communication terminals,
the agent device includes an utterance text sending unit that generates an agent utterance text based on the detection information and sends the agent utterance text to the communication management device, and
the communication control unit broadcasts synthetic voice data of the agent utterance text, generated by synthetic voice processing, to each of the plurality of mobile communication terminals, includes the received agent utterance text in the communication history between users and accumulates it in chronological order, and controls text distribution to each of the mobile communication terminals.

2. The communication system according to claim 1, wherein the communication management device includes a user management unit in which a plurality of the mobile communication terminals are registered and which sets a communication group to be controlled by the first control unit and the second control unit, and
the user management unit provides a function for registering the agent device in the communication group.

3. The communication system according to claim 1, wherein the agent device further includes a control unit that determines whether the detection information satisfies a preset determination condition, and
the utterance text sending unit generates the agent utterance text when it is determined that the detection information satisfies the determination condition.

4. The communication system according to claim 1, wherein the communication control unit transmits the speech recognition result to the agent device,
the agent device includes:
a text receiving unit that receives the speech recognition result; and
an information providing unit that determines, based on the speech recognition result, whether to provide the agent utterance text, and
the utterance text sending unit generates the agent utterance text based on the result of the determination by the information providing unit and sends the generated agent utterance text to the communication management device.

5. The communication system according to claim 4, wherein the information providing unit determines whether the speech recognition result includes a keyword related to the state detection device or a question about the detection information.

6. The communication system according to any one of claims 1 to 5, wherein the communication control unit includes an individual call control unit that transmits speech voice data only to a specific user within the communication group to which speech voice data is distributed by broadcast, and
the individual call control unit performs individual call control to transmit the synthetic voice data of the agent utterance text, generated by synthetic voice processing, to the specific user.

7. A communication method in which a user's speech is broadcast to the mobile communication terminals of other users through a mobile communication terminal carried by each of a plurality of users, the mobile communication terminals being connected to a communication management device by wireless communication, and an agent device to which detection information output from a state detection device of a monitored object is input being connected to the communication management device, the communication method comprising:
a first step in which the communication management device broadcasts speech voice data received from a mobile communication terminal to each of a plurality of other mobile communication terminals;
a second step in which the communication management device accumulates a speech recognition result obtained by speech recognition processing of the received speech voice data in chronological order as a communication history between users, and controls text distribution so that the communication history is displayed synchronously on each of the mobile communication terminals; and
a third step in which the agent device generates an agent utterance text based on the detection information and transmits the agent utterance text to the communication management device,
wherein the first step includes broadcasting synthetic voice data of the agent utterance text, generated by synthetic voice processing, to each of the plurality of mobile communication terminals, and
the second step includes including the received agent utterance text in the communication history between users, accumulating it in chronological order, and controlling text distribution to each of the mobile communication terminals.

8. A program executed by a management device that is connected by wireless communication to a mobile communication terminal carried by each of a plurality of users and that broadcasts a user's speech to the mobile communication terminals of other users, the program causing the management device to realize:
a first function of broadcasting speech voice data received from a mobile communication terminal to each of a plurality of other mobile communication terminals;
a second function of accumulating a speech recognition result obtained by performing speech recognition processing on the received speech voice data in chronological order as a communication history between users, and controlling text distribution so that the communication history is displayed synchronously on each of the mobile communication terminals; and
a third function of receiving an agent utterance text that is based on detection information and generated by an agent device, the agent device being connected to the management device and receiving as input the detection information output from a state detection device of a monitored object, and of generating synthetic voice data of the agent utterance text,
wherein the first function broadcasts the synthetic voice data of the agent utterance text to each of the plurality of mobile communication terminals, and
the second function includes the received agent utterance text in the communication history between users, accumulates it in chronological order, and controls text distribution to each of the mobile communication terminals.

9. A communication system in which a user's speech is broadcast to the mobile communication terminals of other users through a mobile communication terminal carried by each of a plurality of users, the communication system comprising:
a communication control unit including a first control unit that broadcasts speech voice data received from a mobile communication terminal to each of a plurality of other mobile communication terminals, and a second control unit that accumulates a speech recognition result obtained by speech recognition processing of the received speech voice data in chronological order as a communication history between users, and controls text distribution so that the communication history is displayed synchronously on each of the mobile communication terminals; and
an agent unit that receives detection information output from a state detection device of a monitored object and generates an agent utterance text based on the detection information,
wherein the communication control unit broadcasts synthetic voice data of the agent utterance text, generated by synthetic voice processing, to each of the plurality of mobile communication terminals, includes the agent utterance text in the communication history between users and accumulates it in chronological order, and controls text distribution to each of the mobile communication terminals.